From 923e4b6a72e5643fb2373a62e8563827a51520dc Mon Sep 17 00:00:00 2001
From: James Smart
Date: Thu, 4 Dec 2008 22:40:07 -0500
Subject: [SCSI] lpfc 8.3.0 : Hook lpfc's debugfs into Kconfig

Signed-off-by: James Smart
Signed-off-by: James Bottomley
---
 drivers/scsi/Kconfig | 7 +++++++
 1 file changed, 7 insertions(+)

(limited to 'drivers/scsi/Kconfig')

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 403ecad48d4..1badcec18f4 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -1357,6 +1357,13 @@ config SCSI_LPFC
 	  This lpfc driver supports the Emulex LightPulse Family
 	  of Fibre Channel PCI host adapters.
 
+config SCSI_LPFC_DEBUG_FS
+	bool "Emulex LightPulse Fibre Channel debugfs Support"
+	depends on SCSI_LPFC && DEBUG_FS
+	help
+	  This makes debugging information from the lpfc driver
+	  available via the debugfs filesystem.
+
 config SCSI_SIM710
 	tristate "Simple 53c710 SCSI support (Compaq, NCR machines)"
 	depends on (EISA || MCA) && SCSI
-- 
cgit v1.2.3-70-g09d2

From 42e9a92fe6a9095bd68a379aaec7ad2be0337f7a Mon Sep 17 00:00:00 2001
From: Robert Love
Date: Tue, 9 Dec 2008 15:10:17 -0800
Subject: [SCSI] libfc: A modular Fibre Channel library

libFC is composed of 4 blocks supported by an exchange manager and a
framing library. The upper 4 layers are fc_lport, fc_disc, fc_rport and
fc_fcp. An LLD that uses libfc can either use libfc's blocks or, using the
transport template defined in libfc.h, override one or more blocks with its
own implementation.

The EM (Exchange Manager) manages exchanges/sequences for all commands:
ELS, CT and FCP.

The framing library frames ELS and CT commands.

The fc_lport block manages the library's representation of the host's FC
enabled ports.

The fc_disc block manages discovery of targets as well as handling changes
that occur in the FC fabric (via RSCN events).

The fc_rport block manages the library's representation of other entities
in the FC fabric.
Currently the library uses this block for targets, its peer when in
point-to-point mode and the directory server, but can be extended for other
entities if needed.

The fc_fcp block interacts with the scsi-ml and handles all I/O.

Signed-off-by: Robert Love
[jejb: added include of delay.h to fix ppc64 compile prob spotted by sfr]
Signed-off-by: James Bottomley
---
 drivers/scsi/Kconfig          |    6 +
 drivers/scsi/Makefile         |    1 +
 drivers/scsi/libfc/Makefile   |   12 +
 drivers/scsi/libfc/fc_disc.c  |  845 ++++++++++++++++
 drivers/scsi/libfc/fc_elsct.c |   71 ++
 drivers/scsi/libfc/fc_exch.c  | 1970 +++++++++++++++++++++++++++++++++++++
 drivers/scsi/libfc/fc_fcp.c   | 2131 +++++++++++++++++++++++++++++++++++++++++
 drivers/scsi/libfc/fc_frame.c |   89 ++
 drivers/scsi/libfc/fc_lport.c | 1604 +++++++++++++++++++++++++++++++
 drivers/scsi/libfc/fc_rport.c | 1291 +++++++++++++++++++++++++
 include/scsi/fc_encode.h      |  309 ++++++
 include/scsi/fc_frame.h       |  242 +++++
 include/scsi/libfc.h          |  938 ++++++++++++++++++
 13 files changed, 9509 insertions(+)
 create mode 100644 drivers/scsi/libfc/Makefile
 create mode 100644 drivers/scsi/libfc/fc_disc.c
 create mode 100644 drivers/scsi/libfc/fc_elsct.c
 create mode 100644 drivers/scsi/libfc/fc_exch.c
 create mode 100644 drivers/scsi/libfc/fc_fcp.c
 create mode 100644 drivers/scsi/libfc/fc_frame.c
 create mode 100644 drivers/scsi/libfc/fc_lport.c
 create mode 100644 drivers/scsi/libfc/fc_rport.c
 create mode 100644 include/scsi/fc_encode.h
 create mode 100644 include/scsi/fc_frame.h
 create mode 100644 include/scsi/libfc.h

(limited to 'drivers/scsi/Kconfig')

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 1badcec18f4..24d762aab7c 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -603,6 +603,12 @@ config SCSI_FLASHPOINT
 	  substantial, so users of MultiMaster Host Adapters may not
 	  wish to include it.
 
+config LIBFC
+	tristate "LibFC module"
+	depends on SCSI && SCSI_FC_ATTRS
+	---help---
+	  Fibre Channel library module
+
 config SCSI_DMX3191D
 	tristate "DMX3191D SCSI support"
 	depends on PCI && SCSI
diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index b89aedfa9ed..87355f573d6 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_SCSI_SAS_LIBSAS) += libsas/
 obj-$(CONFIG_SCSI_SRP_ATTRS) += scsi_transport_srp.o
 obj-$(CONFIG_SCSI_DH) += device_handler/
 
+obj-$(CONFIG_LIBFC) += libfc/
 obj-$(CONFIG_ISCSI_TCP) += libiscsi.o libiscsi_tcp.o iscsi_tcp.o
 obj-$(CONFIG_INFINIBAND_ISER) += libiscsi.o
 obj-$(CONFIG_SCSI_A4000T) += 53c700.o a4000t.o
diff --git a/drivers/scsi/libfc/Makefile b/drivers/scsi/libfc/Makefile
new file mode 100644
index 00000000000..55f982de3a9
--- /dev/null
+++ b/drivers/scsi/libfc/Makefile
@@ -0,0 +1,12 @@
+# $Id: Makefile
+
+obj-$(CONFIG_LIBFC) += libfc.o
+
+libfc-objs := \
+	fc_disc.o \
+	fc_exch.o \
+	fc_elsct.o \
+	fc_frame.o \
+	fc_lport.o \
+	fc_rport.o \
+	fc_fcp.o
diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c
new file mode 100644
index 00000000000..dd1564c9e04
--- /dev/null
+++ b/drivers/scsi/libfc/fc_disc.c
@@ -0,0 +1,845 @@
+/*
+ * Copyright(c) 2007 - 2008 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Maintained at www.Open-FCoE.org
+ */
+
+/*
+ * Target Discovery
+ *
+ * This block discovers all FC-4 remote ports, including FCP initiators. It
+ * also handles RSCN events and re-discovery if necessary.
+ */
+
+/*
+ * DISC LOCKING
+ *
+ * The disc mutex can be locked when acquiring rport locks, but may not
+ * be held when acquiring the lport lock. Refer to fc_lport.c for more
+ * details.
+ */
+
+#include
+#include
+#include
+
+#include
+
+#include
+
+#define FC_DISC_RETRY_LIMIT	3	/* max retries */
+#define FC_DISC_RETRY_DELAY	500UL	/* (msecs) delay */
+
+#define FC_DISC_DELAY		3
+
+static int fc_disc_debug;
+
+#define FC_DEBUG_DISC(fmt...)			\
+	do {					\
+		if (fc_disc_debug)		\
+			FC_DBG(fmt);		\
+	} while (0)
+
+static void fc_disc_gpn_ft_req(struct fc_disc *);
+static void fc_disc_gpn_ft_resp(struct fc_seq *, struct fc_frame *, void *);
+static int fc_disc_new_target(struct fc_disc *, struct fc_rport *,
+			      struct fc_rport_identifiers *);
+static void fc_disc_del_target(struct fc_disc *, struct fc_rport *);
+static void fc_disc_done(struct fc_disc *);
+static void fc_disc_timeout(struct work_struct *);
+static void fc_disc_single(struct fc_disc *, struct fc_disc_port *);
+static void fc_disc_restart(struct fc_disc *);
+
+/**
+ * fc_disc_lookup_rport - lookup a remote port by port_id
+ * @lport: Fibre Channel host port instance
+ * @port_id: remote port port_id to match
+ */
+struct fc_rport *fc_disc_lookup_rport(const struct fc_lport *lport,
+				      u32 port_id)
+{
+	const struct fc_disc *disc = &lport->disc;
+	struct fc_rport *rport, *found = NULL;
+	struct fc_rport_libfc_priv *rdata;
+	int disc_found = 0;
+
+	list_for_each_entry(rdata, &disc->rports, peers) {
+		rport = PRIV_TO_RPORT(rdata);
+		if (rport->port_id == port_id) {
+			disc_found = 1;
+			found = rport;
+			break;
+		}
+	}
+
+	if (!disc_found)
+		found = NULL;
+
+	return found;
+}
+
+/**
+ * fc_disc_stop_rports - delete all the remote ports associated with the lport
+ * @disc: The discovery job to stop rports
on
+ *
+ * Locking Note: This function expects that the lport mutex is locked before
+ * calling it.
+ */
+void fc_disc_stop_rports(struct fc_disc *disc)
+{
+	struct fc_lport *lport;
+	struct fc_rport *rport;
+	struct fc_rport_libfc_priv *rdata, *next;
+
+	lport = disc->lport;
+
+	mutex_lock(&disc->disc_mutex);
+	list_for_each_entry_safe(rdata, next, &disc->rports, peers) {
+		rport = PRIV_TO_RPORT(rdata);
+		list_del(&rdata->peers);
+		lport->tt.rport_logoff(rport);
+	}
+
+	mutex_unlock(&disc->disc_mutex);
+}
+
+/**
+ * fc_disc_rport_callback - Event handler for rport events
+ * @lport: The lport which is receiving the event
+ * @rport: The rport which the event has occurred on
+ * @event: The event that occurred
+ *
+ * Locking Note: The rport lock should not be held when calling
+ *		 this function.
+ */
+static void fc_disc_rport_callback(struct fc_lport *lport,
+				   struct fc_rport *rport,
+				   enum fc_rport_event event)
+{
+	struct fc_rport_libfc_priv *rdata = rport->dd_data;
+	struct fc_disc *disc = &lport->disc;
+	int found = 0;
+
+	FC_DEBUG_DISC("Received a %d event for port (%6x)\n", event,
+		      rport->port_id);
+
+	if (event == RPORT_EV_CREATED) {
+		if (disc) {
+			found = 1;
+			mutex_lock(&disc->disc_mutex);
+			list_add_tail(&rdata->peers, &disc->rports);
+			mutex_unlock(&disc->disc_mutex);
+		}
+	}
+
+	if (!found)
+		FC_DEBUG_DISC("The rport (%6x) is not maintained "
+			      "by the discovery layer\n", rport->port_id);
+}
+
+/**
+ * fc_disc_recv_rscn_req - Handle Registered State Change Notification (RSCN)
+ * @sp: Current sequence of the RSCN exchange
+ * @fp: RSCN Frame
+ * @disc: FC discovery context
+ *
+ * Locking Note: This function expects that the disc_mutex is locked
+ *		 before it is called.
+ */ +static void fc_disc_recv_rscn_req(struct fc_seq *sp, struct fc_frame *fp, + struct fc_disc *disc) +{ + struct fc_lport *lport; + struct fc_rport *rport; + struct fc_rport_libfc_priv *rdata; + struct fc_els_rscn *rp; + struct fc_els_rscn_page *pp; + struct fc_seq_els_data rjt_data; + unsigned int len; + int redisc = 0; + enum fc_els_rscn_ev_qual ev_qual; + enum fc_els_rscn_addr_fmt fmt; + LIST_HEAD(disc_ports); + struct fc_disc_port *dp, *next; + + lport = disc->lport; + + FC_DEBUG_DISC("Received an RSCN event on port (%6x)\n", + fc_host_port_id(lport->host)); + + /* make sure the frame contains an RSCN message */ + rp = fc_frame_payload_get(fp, sizeof(*rp)); + if (!rp) + goto reject; + /* make sure the page length is as expected (4 bytes) */ + if (rp->rscn_page_len != sizeof(*pp)) + goto reject; + /* get the RSCN payload length */ + len = ntohs(rp->rscn_plen); + if (len < sizeof(*rp)) + goto reject; + /* make sure the frame contains the expected payload */ + rp = fc_frame_payload_get(fp, len); + if (!rp) + goto reject; + /* payload must be a multiple of the RSCN page size */ + len -= sizeof(*rp); + if (len % sizeof(*pp)) + goto reject; + + for (pp = (void *)(rp + 1); len > 0; len -= sizeof(*pp), pp++) { + ev_qual = pp->rscn_page_flags >> ELS_RSCN_EV_QUAL_BIT; + ev_qual &= ELS_RSCN_EV_QUAL_MASK; + fmt = pp->rscn_page_flags >> ELS_RSCN_ADDR_FMT_BIT; + fmt &= ELS_RSCN_ADDR_FMT_MASK; + /* + * if we get an address format other than port + * (area, domain, fabric), then do a full discovery + */ + switch (fmt) { + case ELS_ADDR_FMT_PORT: + FC_DEBUG_DISC("Port address format for port (%6x)\n", + ntoh24(pp->rscn_fid)); + dp = kzalloc(sizeof(*dp), GFP_KERNEL); + if (!dp) { + redisc = 1; + break; + } + dp->lp = lport; + dp->ids.port_id = ntoh24(pp->rscn_fid); + dp->ids.port_name = -1; + dp->ids.node_name = -1; + dp->ids.roles = FC_RPORT_ROLE_UNKNOWN; + list_add_tail(&dp->peers, &disc_ports); + break; + case ELS_ADDR_FMT_AREA: + case ELS_ADDR_FMT_DOM: + case 
ELS_ADDR_FMT_FAB: + default: + FC_DEBUG_DISC("Address format is (%d)\n", fmt); + redisc = 1; + break; + } + } + lport->tt.seq_els_rsp_send(sp, ELS_LS_ACC, NULL); + if (redisc) { + FC_DEBUG_DISC("RSCN received: rediscovering\n"); + fc_disc_restart(disc); + } else { + FC_DEBUG_DISC("RSCN received: not rediscovering. " + "redisc %d state %d in_prog %d\n", + redisc, lport->state, disc->pending); + list_for_each_entry_safe(dp, next, &disc_ports, peers) { + list_del(&dp->peers); + rport = lport->tt.rport_lookup(lport, dp->ids.port_id); + if (rport) { + rdata = RPORT_TO_PRIV(rport); + list_del(&rdata->peers); + lport->tt.rport_logoff(rport); + } + fc_disc_single(disc, dp); + } + } + fc_frame_free(fp); + return; +reject: + FC_DEBUG_DISC("Received a bad RSCN frame\n"); + rjt_data.fp = NULL; + rjt_data.reason = ELS_RJT_LOGIC; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + fc_frame_free(fp); +} + +/** + * fc_disc_recv_req - Handle incoming requests + * @sp: Current sequence of the request exchange + * @fp: The frame + * @lport: The FC local port + * + * Locking Note: This function is called from the EM and will lock + * the disc_mutex before calling the handler for the + * request. + */ +static void fc_disc_recv_req(struct fc_seq *sp, struct fc_frame *fp, + struct fc_lport *lport) +{ + u8 op; + struct fc_disc *disc = &lport->disc; + + op = fc_frame_payload_op(fp); + switch (op) { + case ELS_RSCN: + mutex_lock(&disc->disc_mutex); + fc_disc_recv_rscn_req(sp, fp, disc); + mutex_unlock(&disc->disc_mutex); + break; + default: + FC_DBG("Received an unsupported request. opcode (%x)\n", op); + break; + } +} + +/** + * fc_disc_restart - Restart discovery + * @lport: FC discovery context + * + * Locking Note: This function expects that the disc mutex + * is already locked. 
+ */ +static void fc_disc_restart(struct fc_disc *disc) +{ + struct fc_rport *rport; + struct fc_rport_libfc_priv *rdata, *next; + struct fc_lport *lport = disc->lport; + + FC_DEBUG_DISC("Restarting discovery for port (%6x)\n", + fc_host_port_id(lport->host)); + + list_for_each_entry_safe(rdata, next, &disc->rports, peers) { + rport = PRIV_TO_RPORT(rdata); + FC_DEBUG_DISC("list_del(%6x)\n", rport->port_id); + list_del(&rdata->peers); + lport->tt.rport_logoff(rport); + } + + disc->requested = 1; + if (!disc->pending) + fc_disc_gpn_ft_req(disc); +} + +/** + * fc_disc_start - Fibre Channel Target discovery + * @lport: FC local port + * + * Returns non-zero if discovery cannot be started. + */ +static void fc_disc_start(void (*disc_callback)(struct fc_lport *, + enum fc_disc_event), + struct fc_lport *lport) +{ + struct fc_rport *rport; + struct fc_rport_identifiers ids; + struct fc_disc *disc = &lport->disc; + + /* + * At this point we may have a new disc job or an existing + * one. Either way, let's lock when we make changes to it + * and send the GPN_FT request. + */ + mutex_lock(&disc->disc_mutex); + + disc->disc_callback = disc_callback; + + /* + * If not ready, or already running discovery, just set request flag. + */ + disc->requested = 1; + + if (disc->pending) { + mutex_unlock(&disc->disc_mutex); + return; + } + + /* + * Handle point-to-point mode as a simple discovery + * of the remote port. Yucky, yucky, yuck, yuck! 
+ */ + rport = disc->lport->ptp_rp; + if (rport) { + ids.port_id = rport->port_id; + ids.port_name = rport->port_name; + ids.node_name = rport->node_name; + ids.roles = FC_RPORT_ROLE_UNKNOWN; + get_device(&rport->dev); + + if (!fc_disc_new_target(disc, rport, &ids)) { + disc->event = DISC_EV_SUCCESS; + fc_disc_done(disc); + } + put_device(&rport->dev); + } else { + fc_disc_gpn_ft_req(disc); /* get ports by FC-4 type */ + } + + mutex_unlock(&disc->disc_mutex); +} + +static struct fc_rport_operations fc_disc_rport_ops = { + .event_callback = fc_disc_rport_callback, +}; + +/** + * fc_disc_new_target - Handle new target found by discovery + * @lport: FC local port + * @rport: The previous FC remote port (NULL if new remote port) + * @ids: Identifiers for the new FC remote port + * + * Locking Note: This function expects that the disc_mutex is locked + * before it is called. + */ +static int fc_disc_new_target(struct fc_disc *disc, + struct fc_rport *rport, + struct fc_rport_identifiers *ids) +{ + struct fc_lport *lport = disc->lport; + struct fc_rport_libfc_priv *rp; + int error = 0; + + if (rport && ids->port_name) { + if (rport->port_name == -1) { + /* + * Set WWN and fall through to notify of create. + */ + fc_rport_set_name(rport, ids->port_name, + rport->node_name); + } else if (rport->port_name != ids->port_name) { + /* + * This is a new port with the same FCID as + * a previously-discovered port. Presumably the old + * port logged out and a new port logged in and was + * assigned the same FCID. This should be rare. + * Delete the old one and fall thru to re-create. 
+ */ + fc_disc_del_target(disc, rport); + rport = NULL; + } + } + if (((ids->port_name != -1) || (ids->port_id != -1)) && + ids->port_id != fc_host_port_id(lport->host) && + ids->port_name != lport->wwpn) { + if (!rport) { + rport = lport->tt.rport_lookup(lport, ids->port_id); + if (!rport) { + struct fc_disc_port dp; + dp.lp = lport; + dp.ids.port_id = ids->port_id; + dp.ids.port_name = ids->port_name; + dp.ids.node_name = ids->node_name; + dp.ids.roles = ids->roles; + rport = fc_rport_rogue_create(&dp); + } + if (!rport) + error = -ENOMEM; + } + if (rport) { + rp = rport->dd_data; + rp->ops = &fc_disc_rport_ops; + rp->rp_state = RPORT_ST_INIT; + lport->tt.rport_login(rport); + } + } + return error; +} + +/** + * fc_disc_del_target - Delete a target + * @disc: FC discovery context + * @rport: The remote port to be removed + */ +static void fc_disc_del_target(struct fc_disc *disc, struct fc_rport *rport) +{ + struct fc_lport *lport = disc->lport; + struct fc_rport_libfc_priv *rdata = RPORT_TO_PRIV(rport); + list_del(&rdata->peers); + lport->tt.rport_logoff(rport); +} + +/** + * fc_disc_done - Discovery has been completed + * @disc: FC discovery context + */ +static void fc_disc_done(struct fc_disc *disc) +{ + struct fc_lport *lport = disc->lport; + + FC_DEBUG_DISC("Discovery complete for port (%6x)\n", + fc_host_port_id(lport->host)); + + disc->disc_callback(lport, disc->event); + disc->event = DISC_EV_NONE; + + if (disc->requested) + fc_disc_gpn_ft_req(disc); + else + disc->pending = 0; +} + +/** + * fc_disc_error - Handle error on dNS request + * @disc: FC discovery context + * @fp: The frame pointer + */ +static void fc_disc_error(struct fc_disc *disc, struct fc_frame *fp) +{ + struct fc_lport *lport = disc->lport; + unsigned long delay = 0; + if (fc_disc_debug) + FC_DBG("Error %ld, retries %d/%d\n", + PTR_ERR(fp), disc->retry_count, + FC_DISC_RETRY_LIMIT); + + if (!fp || PTR_ERR(fp) == -FC_EX_TIMEOUT) { + /* + * Memory allocation failure, or the exchange timed 
out, + * retry after delay. + */ + if (disc->retry_count < FC_DISC_RETRY_LIMIT) { + /* go ahead and retry */ + if (!fp) + delay = msecs_to_jiffies(FC_DISC_RETRY_DELAY); + else { + delay = msecs_to_jiffies(lport->e_d_tov); + + /* timeout faster first time */ + if (!disc->retry_count) + delay /= 4; + } + disc->retry_count++; + schedule_delayed_work(&disc->disc_work, delay); + } else { + /* exceeded retries */ + disc->event = DISC_EV_FAILED; + fc_disc_done(disc); + } + } +} + +/** + * fc_disc_gpn_ft_req - Send Get Port Names by FC-4 type (GPN_FT) request + * @lport: FC discovery context + * + * Locking Note: This function expects that the disc_mutex is locked + * before it is called. + */ +static void fc_disc_gpn_ft_req(struct fc_disc *disc) +{ + struct fc_frame *fp; + struct fc_lport *lport = disc->lport; + + WARN_ON(!fc_lport_test_ready(lport)); + + disc->pending = 1; + disc->requested = 0; + + disc->buf_len = 0; + disc->seq_count = 0; + fp = fc_frame_alloc(lport, + sizeof(struct fc_ct_hdr) + + sizeof(struct fc_ns_gid_ft)); + if (!fp) + goto err; + + if (lport->tt.elsct_send(lport, NULL, fp, + FC_NS_GPN_FT, + fc_disc_gpn_ft_resp, + disc, lport->e_d_tov)) + return; +err: + fc_disc_error(disc, fp); +} + +/** + * fc_disc_gpn_ft_parse - Parse the list of IDs and names resulting from a request + * @lport: Fibre Channel host port instance + * @buf: GPN_FT response buffer + * @len: size of response buffer + */ +static int fc_disc_gpn_ft_parse(struct fc_disc *disc, void *buf, size_t len) +{ + struct fc_lport *lport; + struct fc_gpn_ft_resp *np; + char *bp; + size_t plen; + size_t tlen; + int error = 0; + struct fc_disc_port dp; + struct fc_rport *rport; + struct fc_rport_libfc_priv *rdata; + + lport = disc->lport; + + /* + * Handle partial name record left over from previous call. 
+ */ + bp = buf; + plen = len; + np = (struct fc_gpn_ft_resp *)bp; + tlen = disc->buf_len; + if (tlen) { + WARN_ON(tlen >= sizeof(*np)); + plen = sizeof(*np) - tlen; + WARN_ON(plen <= 0); + WARN_ON(plen >= sizeof(*np)); + if (plen > len) + plen = len; + np = &disc->partial_buf; + memcpy((char *)np + tlen, bp, plen); + + /* + * Set bp so that the loop below will advance it to the + * first valid full name element. + */ + bp -= tlen; + len += tlen; + plen += tlen; + disc->buf_len = (unsigned char) plen; + if (plen == sizeof(*np)) + disc->buf_len = 0; + } + + /* + * Handle full name records, including the one filled from above. + * Normally, np == bp and plen == len, but from the partial case above, + * bp, len describe the overall buffer, and np, plen describe the + * partial buffer, which if would usually be full now. + * After the first time through the loop, things return to "normal". + */ + while (plen >= sizeof(*np)) { + dp.lp = lport; + dp.ids.port_id = ntoh24(np->fp_fid); + dp.ids.port_name = ntohll(np->fp_wwpn); + dp.ids.node_name = -1; + dp.ids.roles = FC_RPORT_ROLE_UNKNOWN; + + if ((dp.ids.port_id != fc_host_port_id(lport->host)) && + (dp.ids.port_name != lport->wwpn)) { + rport = fc_rport_rogue_create(&dp); + if (rport) { + rdata = rport->dd_data; + rdata->ops = &fc_disc_rport_ops; + rdata->local_port = lport; + lport->tt.rport_login(rport); + } else + FC_DBG("Failed to allocate memory for " + "the newly discovered port (%6x)\n", + dp.ids.port_id); + } + + if (np->fp_flags & FC_NS_FID_LAST) { + disc->event = DISC_EV_SUCCESS; + fc_disc_done(disc); + len = 0; + break; + } + len -= sizeof(*np); + bp += sizeof(*np); + np = (struct fc_gpn_ft_resp *)bp; + plen = len; + } + + /* + * Save any partial record at the end of the buffer for next time. 
+ */ + if (error == 0 && len > 0 && len < sizeof(*np)) { + if (np != &disc->partial_buf) { + FC_DEBUG_DISC("Partial buffer remains " + "for discovery by (%6x)\n", + fc_host_port_id(lport->host)); + memcpy(&disc->partial_buf, np, len); + } + disc->buf_len = (unsigned char) len; + } else { + disc->buf_len = 0; + } + return error; +} + +/* + * Handle retry of memory allocation for remote ports. + */ +static void fc_disc_timeout(struct work_struct *work) +{ + struct fc_disc *disc = container_of(work, + struct fc_disc, + disc_work.work); + mutex_lock(&disc->disc_mutex); + if (disc->requested && !disc->pending) + fc_disc_gpn_ft_req(disc); + mutex_unlock(&disc->disc_mutex); +} + +/** + * fc_disc_gpn_ft_resp - Handle a response frame from Get Port Names (GPN_FT) + * @sp: Current sequence of GPN_FT exchange + * @fp: response frame + * @lp_arg: Fibre Channel host port instance + * + * Locking Note: This function expects that the disc_mutex is locked + * before it is called. + */ +static void fc_disc_gpn_ft_resp(struct fc_seq *sp, struct fc_frame *fp, + void *disc_arg) +{ + struct fc_disc *disc = disc_arg; + struct fc_ct_hdr *cp; + struct fc_frame_header *fh; + unsigned int seq_cnt; + void *buf = NULL; + unsigned int len; + int error; + + FC_DEBUG_DISC("Received a GPN_FT response on port (%6x)\n", + fc_host_port_id(disc->lport->host)); + + if (IS_ERR(fp)) { + fc_disc_error(disc, fp); + return; + } + + WARN_ON(!fc_frame_is_linear(fp)); /* buffer must be contiguous */ + fh = fc_frame_header_get(fp); + len = fr_len(fp) - sizeof(*fh); + seq_cnt = ntohs(fh->fh_seq_cnt); + if (fr_sof(fp) == FC_SOF_I3 && seq_cnt == 0 && + disc->seq_count == 0) { + cp = fc_frame_payload_get(fp, sizeof(*cp)); + if (!cp) { + FC_DBG("GPN_FT response too short, len %d\n", + fr_len(fp)); + } else if (ntohs(cp->ct_cmd) == FC_FS_ACC) { + + /* + * Accepted. Parse response. 
+ */ + buf = cp + 1; + len -= sizeof(*cp); + } else if (ntohs(cp->ct_cmd) == FC_FS_RJT) { + FC_DBG("GPN_FT rejected reason %x exp %x " + "(check zoning)\n", cp->ct_reason, + cp->ct_explan); + disc->event = DISC_EV_FAILED; + fc_disc_done(disc); + } else { + FC_DBG("GPN_FT unexpected response code %x\n", + ntohs(cp->ct_cmd)); + } + } else if (fr_sof(fp) == FC_SOF_N3 && + seq_cnt == disc->seq_count) { + buf = fh + 1; + } else { + FC_DBG("GPN_FT unexpected frame - out of sequence? " + "seq_cnt %x expected %x sof %x eof %x\n", + seq_cnt, disc->seq_count, fr_sof(fp), fr_eof(fp)); + } + if (buf) { + error = fc_disc_gpn_ft_parse(disc, buf, len); + if (error) + fc_disc_error(disc, fp); + else + disc->seq_count++; + } + fc_frame_free(fp); +} + +/** + * fc_disc_single - Discover the directory information for a single target + * @lport: FC local port + * @dp: The port to rediscover + * + * Locking Note: This function expects that the disc_mutex is locked + * before it is called. + */ +static void fc_disc_single(struct fc_disc *disc, struct fc_disc_port *dp) +{ + struct fc_lport *lport; + struct fc_rport *rport; + struct fc_rport *new_rport; + struct fc_rport_libfc_priv *rdata; + + lport = disc->lport; + + if (dp->ids.port_id == fc_host_port_id(lport->host)) + goto out; + + rport = lport->tt.rport_lookup(lport, dp->ids.port_id); + if (rport) + fc_disc_del_target(disc, rport); + + new_rport = fc_rport_rogue_create(dp); + if (new_rport) { + rdata = new_rport->dd_data; + rdata->ops = &fc_disc_rport_ops; + kfree(dp); + lport->tt.rport_login(new_rport); + } + return; +out: + kfree(dp); +} + +/** + * fc_disc_stop - Stop discovery for a given lport + * @lport: The lport that discovery should stop for + */ +void fc_disc_stop(struct fc_lport *lport) +{ + struct fc_disc *disc = &lport->disc; + + if (disc) { + cancel_delayed_work_sync(&disc->disc_work); + fc_disc_stop_rports(disc); + } +} + +/** + * fc_disc_stop_final - Stop discovery for a given lport + * @lport: The lport that discovery 
should stop for + * + * This function will block until discovery has been + * completely stopped and all rports have been deleted. + */ +void fc_disc_stop_final(struct fc_lport *lport) +{ + fc_disc_stop(lport); + lport->tt.rport_flush_queue(); +} + +/** + * fc_disc_init - Initialize the discovery block + * @lport: FC local port + */ +int fc_disc_init(struct fc_lport *lport) +{ + struct fc_disc *disc; + + if (!lport->tt.disc_start) + lport->tt.disc_start = fc_disc_start; + + if (!lport->tt.disc_stop) + lport->tt.disc_stop = fc_disc_stop; + + if (!lport->tt.disc_stop_final) + lport->tt.disc_stop_final = fc_disc_stop_final; + + if (!lport->tt.disc_recv_req) + lport->tt.disc_recv_req = fc_disc_recv_req; + + if (!lport->tt.rport_lookup) + lport->tt.rport_lookup = fc_disc_lookup_rport; + + disc = &lport->disc; + INIT_DELAYED_WORK(&disc->disc_work, fc_disc_timeout); + mutex_init(&disc->disc_mutex); + INIT_LIST_HEAD(&disc->rports); + + disc->lport = lport; + disc->delay = FC_DISC_DELAY; + disc->event = DISC_EV_NONE; + + return 0; +} +EXPORT_SYMBOL(fc_disc_init); diff --git a/drivers/scsi/libfc/fc_elsct.c b/drivers/scsi/libfc/fc_elsct.c new file mode 100644 index 00000000000..dd47fe619d1 --- /dev/null +++ b/drivers/scsi/libfc/fc_elsct.c @@ -0,0 +1,71 @@ +/* + * Copyright(c) 2008 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. 
+ * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +/* + * Provide interface to send ELS/CT FC frames + */ + +#include +#include +#include +#include +#include +#include + +/* + * fc_elsct_send - sends ELS/CT frame + */ +static struct fc_seq *fc_elsct_send(struct fc_lport *lport, + struct fc_rport *rport, + struct fc_frame *fp, + unsigned int op, + void (*resp)(struct fc_seq *, + struct fc_frame *fp, + void *arg), + void *arg, u32 timer_msec) +{ + enum fc_rctl r_ctl; + u32 did; + enum fc_fh_type fh_type; + int rc; + + /* ELS requests */ + if ((op >= ELS_LS_RJT) && (op <= ELS_AUTH_ELS)) + rc = fc_els_fill(lport, rport, fp, op, &r_ctl, &did, &fh_type); + else + /* CT requests */ + rc = fc_ct_fill(lport, fp, op, &r_ctl, &did, &fh_type); + + if (rc) + return NULL; + + fc_fill_fc_hdr(fp, r_ctl, did, fc_host_port_id(lport->host), fh_type, + FC_FC_FIRST_SEQ | FC_FC_END_SEQ | FC_FC_SEQ_INIT, 0); + + return lport->tt.exch_seq_send(lport, fp, resp, NULL, arg, timer_msec); +} + +int fc_elsct_init(struct fc_lport *lport) +{ + if (!lport->tt.elsct_send) + lport->tt.elsct_send = fc_elsct_send; + + return 0; +} +EXPORT_SYMBOL(fc_elsct_init); diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c new file mode 100644 index 00000000000..66db08a5f27 --- /dev/null +++ b/drivers/scsi/libfc/fc_exch.c @@ -0,0 +1,1970 @@ +/* + * Copyright(c) 2007 Intel Corporation. All rights reserved. + * Copyright(c) 2008 Red Hat, Inc. All rights reserved. + * Copyright(c) 2008 Mike Christie + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. 
+ * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +/* + * Fibre Channel exchange and sequence handling. + */ + +#include +#include +#include + +#include + +#include +#include + +#define FC_DEF_R_A_TOV (10 * 1000) /* resource allocation timeout */ + +/* + * fc_exch_debug can be set in debugger or at compile time to get more logs. + */ +static int fc_exch_debug; + +#define FC_DEBUG_EXCH(fmt...) \ + do { \ + if (fc_exch_debug) \ + FC_DBG(fmt); \ + } while (0) + +static struct kmem_cache *fc_em_cachep; /* cache for exchanges */ + +/* + * Structure and function definitions for managing Fibre Channel Exchanges + * and Sequences. + * + * The three primary structures used here are fc_exch_mgr, fc_exch, and fc_seq. + * + * fc_exch_mgr holds the exchange state for an N port + * + * fc_exch holds state for one exchange and links to its active sequence. + * + * fc_seq holds the state for an individual sequence. + */ + +/* + * Exchange manager. + * + * This structure is the center for creating exchanges and sequences. + * It manages the allocation of exchange IDs. 
+ */ +struct fc_exch_mgr { + enum fc_class class; /* default class for sequences */ + spinlock_t em_lock; /* exchange manager lock, + must be taken before ex_lock */ + u16 last_xid; /* last allocated exchange ID */ + u16 min_xid; /* min exchange ID */ + u16 max_xid; /* max exchange ID */ + u16 max_read; /* max exchange ID for read */ + u16 last_read; /* last xid allocated for read */ + u32 total_exches; /* total allocated exchanges */ + struct list_head ex_list; /* allocated exchanges list */ + struct fc_lport *lp; /* fc device instance */ + mempool_t *ep_pool; /* reserve ep's */ + + /* + * currently exchange mgr stats are updated but not used. + * either stats can be expose via sysfs or remove them + * all together if not used XXX + */ + struct { + atomic_t no_free_exch; + atomic_t no_free_exch_xid; + atomic_t xid_not_found; + atomic_t xid_busy; + atomic_t seq_not_found; + atomic_t non_bls_resp; + } stats; + struct fc_exch **exches; /* for exch pointers indexed by xid */ +}; +#define fc_seq_exch(sp) container_of(sp, struct fc_exch, seq) + +static void fc_exch_rrq(struct fc_exch *); +static void fc_seq_ls_acc(struct fc_seq *); +static void fc_seq_ls_rjt(struct fc_seq *, enum fc_els_rjt_reason, + enum fc_els_rjt_explan); +static void fc_exch_els_rec(struct fc_seq *, struct fc_frame *); +static void fc_exch_els_rrq(struct fc_seq *, struct fc_frame *); +static struct fc_seq *fc_seq_start_next_locked(struct fc_seq *sp); + +/* + * Internal implementation notes. + * + * The exchange manager is one by default in libfc but LLD may choose + * to have one per CPU. The sequence manager is one per exchange manager + * and currently never separated. + * + * Section 9.8 in FC-FS-2 specifies: "The SEQ_ID is a one-byte field + * assigned by the Sequence Initiator that shall be unique for a specific + * D_ID and S_ID pair while the Sequence is open." Note that it isn't + * qualified by exchange ID, which one might think it would be. 
+ * In practice this limits the number of open sequences and exchanges to 256 + * per session. For most targets we could treat this limit as per exchange. + * + * The exchange and its sequence are freed when the last sequence is received. + * It's possible for the remote port to leave an exchange open without + * sending any sequences. + * + * Notes on reference counts: + * + * Exchanges are reference counted; an exchange is freed when its reference + * count becomes zero. + * + * Timeouts: + * Sequences are timed out for E_D_TOV and R_A_TOV. + * + * Sequence event handling: + * + * The following events may occur on initiator sequences: + * + * Send. + * For now, the whole thing is sent. + * Receive ACK + * This applies only to class F. + * The sequence is marked complete. + * ULP completion. + * The upper layer calls fc_exch_done() when done + * with exchange and sequence tuple. + * RX-inferred completion. + * When we receive the next sequence on the same exchange, we can + * retire the previous sequence ID. (XXX not implemented). + * Timeout. + * R_A_TOV frees the sequence ID. If we're waiting for ACK, + * E_D_TOV causes abort and calls upper layer response handler + * with FC_EX_TIMEOUT error. + * Receive RJT + * XXX defer. + * Send ABTS + * On timeout. + * + * The following events may occur on recipient sequences: + * + * Receive + * Allocate sequence for first frame received. + * Hold during receive handler. + * Release when final frame received. + * Keep status of last N of these for the ELS RES command. XXX TBD. + * Receive ABTS + * Deallocate sequence + * Send RJT + * Deallocate + * + * For now, we neglect conditions where only part of a sequence was + * received or transmitted, or where out-of-order receipt is detected. + */ + +/* + * Locking notes: + * + * The EM code runs in a per-CPU worker thread. + * + * To protect against concurrency between worker thread code and timers, + * sequence allocation and deallocation must be locked.
+ * - exchange refcnt updates can be done atomically without locks. + * - sequence allocation must be locked by exch lock. + * - If the em_lock and ex_lock must be taken at the same time, then the + * em_lock must be taken before the ex_lock. + */ + +/* + * opcode names for debugging. + */ +static char *fc_exch_rctl_names[] = FC_RCTL_NAMES_INIT; + +#define FC_TABLE_SIZE(x) (sizeof(x) / sizeof(x[0])) + +static inline const char *fc_exch_name_lookup(unsigned int op, char **table, + unsigned int max_index) +{ + const char *name = NULL; + + if (op < max_index) + name = table[op]; + if (!name) + name = "unknown"; + return name; +} + +static const char *fc_exch_rctl_name(unsigned int op) +{ + return fc_exch_name_lookup(op, fc_exch_rctl_names, + FC_TABLE_SIZE(fc_exch_rctl_names)); +} + +/* + * Hold an exchange - keep it from being freed. + */ +static void fc_exch_hold(struct fc_exch *ep) +{ + atomic_inc(&ep->ex_refcnt); +} + +/* + * Set up the FC header by initializing a few more FC header fields and sof/eof. + * Fields initialized by this function: + * - fh_ox_id, fh_rx_id, fh_seq_id, fh_seq_cnt + * - sof and eof + */ +static void fc_exch_setup_hdr(struct fc_exch *ep, struct fc_frame *fp, + u32 f_ctl) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + u16 fill; + + fr_sof(fp) = ep->class; + if (ep->seq.cnt) + fr_sof(fp) = fc_sof_normal(ep->class); + + if (f_ctl & FC_FC_END_SEQ) { + fr_eof(fp) = FC_EOF_T; + if (fc_sof_needs_ack(ep->class)) + fr_eof(fp) = FC_EOF_N; + /* + * Form f_ctl. + * The number of fill bytes to make the length a 4-byte + * multiple is the low order 2-bits of the f_ctl. + * The fill itself will have been cleared by the frame + * allocation. + * After this, the length will be even, as expected by + * the transport.
+ */ + fill = fr_len(fp) & 3; + if (fill) { + fill = 4 - fill; + /* TODO, this may be a problem with fragmented skb */ + skb_put(fp_skb(fp), fill); + hton24(fh->fh_f_ctl, f_ctl | fill); + } + } else { + WARN_ON(fr_len(fp) % 4 != 0); /* no pad to non last frame */ + fr_eof(fp) = FC_EOF_N; + } + + /* + * Initialize remaining fh fields + * from fc_fill_fc_hdr + */ + fh->fh_ox_id = htons(ep->oxid); + fh->fh_rx_id = htons(ep->rxid); + fh->fh_seq_id = ep->seq.id; + fh->fh_seq_cnt = htons(ep->seq.cnt); +} + + +/* + * Release a reference to an exchange. + * If the refcnt goes to zero and the exchange is complete, it is freed. + */ +static void fc_exch_release(struct fc_exch *ep) +{ + struct fc_exch_mgr *mp; + + if (atomic_dec_and_test(&ep->ex_refcnt)) { + mp = ep->em; + if (ep->destructor) + ep->destructor(&ep->seq, ep->arg); + if (ep->lp->tt.exch_put) + ep->lp->tt.exch_put(ep->lp, mp, ep->xid); + WARN_ON(!(ep->esb_stat & ESB_ST_COMPLETE)); + mempool_free(ep, mp->ep_pool); + } +} + +static int fc_exch_done_locked(struct fc_exch *ep) +{ + int rc = 1; + + /* + * We must check for completion in case there are two threads + * trying to complete this. But the rrq code will reuse the + * ep, and in that case we only clear the resp and set it as + * complete, so it can be reused by the timer to send the rrq.
+ */ + ep->resp = NULL; + if (ep->state & FC_EX_DONE) + return rc; + ep->esb_stat |= ESB_ST_COMPLETE; + + if (!(ep->esb_stat & ESB_ST_REC_QUAL)) { + ep->state |= FC_EX_DONE; + if (cancel_delayed_work(&ep->timeout_work)) + atomic_dec(&ep->ex_refcnt); /* drop hold for timer */ + rc = 0; + } + return rc; +} + +static void fc_exch_mgr_delete_ep(struct fc_exch *ep) +{ + struct fc_exch_mgr *mp; + + mp = ep->em; + spin_lock_bh(&mp->em_lock); + WARN_ON(mp->total_exches <= 0); + mp->total_exches--; + mp->exches[ep->xid - mp->min_xid] = NULL; + list_del(&ep->ex_list); + spin_unlock_bh(&mp->em_lock); + fc_exch_release(ep); /* drop hold for exch in mp */ +} + +/* + * Internal version of fc_exch_timer_set - used with lock held. + */ +static inline void fc_exch_timer_set_locked(struct fc_exch *ep, + unsigned int timer_msec) +{ + if (ep->state & (FC_EX_RST_CLEANUP | FC_EX_DONE)) + return; + + FC_DEBUG_EXCH("Exchange (%4x) timed out, notifying the upper layer\n", + ep->xid); + if (schedule_delayed_work(&ep->timeout_work, + msecs_to_jiffies(timer_msec))) + fc_exch_hold(ep); /* hold for timer */ +} + +/* + * Set timer for an exchange. + * The time is a minimum delay in milliseconds until the timer fires. + * Used for upper level protocols to time out the exchange. + * The timer is cancelled when it fires or when the exchange completes. + * Returns non-zero if a timer couldn't be allocated. 
+ */ +static void fc_exch_timer_set(struct fc_exch *ep, unsigned int timer_msec) +{ + spin_lock_bh(&ep->ex_lock); + fc_exch_timer_set_locked(ep, timer_msec); + spin_unlock_bh(&ep->ex_lock); +} + +int fc_seq_exch_abort(const struct fc_seq *req_sp, unsigned int timer_msec) +{ + struct fc_seq *sp; + struct fc_exch *ep; + struct fc_frame *fp; + int error; + + ep = fc_seq_exch(req_sp); + + spin_lock_bh(&ep->ex_lock); + if (ep->esb_stat & (ESB_ST_COMPLETE | ESB_ST_ABNORMAL) || + ep->state & (FC_EX_DONE | FC_EX_RST_CLEANUP)) { + spin_unlock_bh(&ep->ex_lock); + return -ENXIO; + } + + /* + * Send the abort on a new sequence if possible. + */ + sp = fc_seq_start_next_locked(&ep->seq); + if (!sp) { + spin_unlock_bh(&ep->ex_lock); + return -ENOMEM; + } + + ep->esb_stat |= ESB_ST_SEQ_INIT | ESB_ST_ABNORMAL; + if (timer_msec) + fc_exch_timer_set_locked(ep, timer_msec); + spin_unlock_bh(&ep->ex_lock); + + /* + * If not logged into the fabric, don't send ABTS but leave + * sequence active until next timeout. + */ + if (!ep->sid) + return 0; + + /* + * Send an abort for the sequence that timed out. + */ + fp = fc_frame_alloc(ep->lp, 0); + if (fp) { + fc_fill_fc_hdr(fp, FC_RCTL_BA_ABTS, ep->did, ep->sid, + FC_TYPE_BLS, FC_FC_END_SEQ | FC_FC_SEQ_INIT, 0); + error = fc_seq_send(ep->lp, sp, fp); + } else + error = -ENOBUFS; + return error; +} +EXPORT_SYMBOL(fc_seq_exch_abort); + +/* + * Exchange timeout - handle exchange timer expiration. + * The timer will have been cancelled before this is called. 
+ */ +static void fc_exch_timeout(struct work_struct *work) +{ + struct fc_exch *ep = container_of(work, struct fc_exch, + timeout_work.work); + struct fc_seq *sp = &ep->seq; + void (*resp)(struct fc_seq *, struct fc_frame *fp, void *arg); + void *arg; + u32 e_stat; + int rc = 1; + + spin_lock_bh(&ep->ex_lock); + if (ep->state & (FC_EX_RST_CLEANUP | FC_EX_DONE)) + goto unlock; + + e_stat = ep->esb_stat; + if (e_stat & ESB_ST_COMPLETE) { + ep->esb_stat = e_stat & ~ESB_ST_REC_QUAL; + if (e_stat & ESB_ST_REC_QUAL) + fc_exch_rrq(ep); + spin_unlock_bh(&ep->ex_lock); + goto done; + } else { + resp = ep->resp; + arg = ep->arg; + ep->resp = NULL; + if (e_stat & ESB_ST_ABNORMAL) + rc = fc_exch_done_locked(ep); + spin_unlock_bh(&ep->ex_lock); + if (!rc) + fc_exch_mgr_delete_ep(ep); + if (resp) + resp(sp, ERR_PTR(-FC_EX_TIMEOUT), arg); + fc_seq_exch_abort(sp, 2 * ep->r_a_tov); + goto done; + } +unlock: + spin_unlock_bh(&ep->ex_lock); +done: + /* + * This release matches the hold taken when the timer was set. + */ + fc_exch_release(ep); +} + +/* + * Allocate a sequence. + * + * We don't support multiple originated sequences on the same exchange. + * By implication, any previously originated sequence on this exchange + * is complete, and we reallocate the same sequence. + */ +static struct fc_seq *fc_seq_alloc(struct fc_exch *ep, u8 seq_id) +{ + struct fc_seq *sp; + + sp = &ep->seq; + sp->ssb_stat = 0; + sp->cnt = 0; + sp->id = seq_id; + return sp; +} + +/* + * fc_em_alloc_xid - returns an xid based on request type + * @lp : ptr to associated lport + * @fp : ptr to the assocated frame + * + * check the associated fc_fsp_pkt to get scsi command type and + * command direction to decide from which range this exch id + * will be allocated from. 
+ * + * Returns : 0 or an valid xid + */ +static u16 fc_em_alloc_xid(struct fc_exch_mgr *mp, const struct fc_frame *fp) +{ + u16 xid, min, max; + u16 *plast; + struct fc_exch *ep = NULL; + + if (mp->max_read) { + if (fc_frame_is_read(fp)) { + min = mp->min_xid; + max = mp->max_read; + plast = &mp->last_read; + } else { + min = mp->max_read + 1; + max = mp->max_xid; + plast = &mp->last_xid; + } + } else { + min = mp->min_xid; + max = mp->max_xid; + plast = &mp->last_xid; + } + xid = *plast; + do { + xid = (xid == max) ? min : xid + 1; + ep = mp->exches[xid - mp->min_xid]; + } while ((ep != NULL) && (xid != *plast)); + + if (unlikely(ep)) + xid = 0; + else + *plast = xid; + + return xid; +} + +/* + * fc_exch_alloc - allocate an exchange. + * @mp : ptr to the exchange manager + * @xid: input xid + * + * if xid is supplied zero then assign next free exchange ID + * from exchange manager, otherwise use supplied xid. + * Returns with exch lock held. + */ +struct fc_exch *fc_exch_alloc(struct fc_exch_mgr *mp, + struct fc_frame *fp, u16 xid) +{ + struct fc_exch *ep; + + /* allocate memory for exchange */ + ep = mempool_alloc(mp->ep_pool, GFP_ATOMIC); + if (!ep) { + atomic_inc(&mp->stats.no_free_exch); + goto out; + } + memset(ep, 0, sizeof(*ep)); + + spin_lock_bh(&mp->em_lock); + /* alloc xid if input xid 0 */ + if (!xid) { + /* alloc a new xid */ + xid = fc_em_alloc_xid(mp, fp); + if (!xid) { + printk(KERN_ERR "fc_em_alloc_xid() failed\n"); + goto err; + } + } + + fc_exch_hold(ep); /* hold for exch in mp */ + spin_lock_init(&ep->ex_lock); + /* + * Hold exch lock for caller to prevent fc_exch_reset() + * from releasing exch while fc_exch_alloc() caller is + * still working on exch. 
+ */ + spin_lock_bh(&ep->ex_lock); + + mp->exches[xid - mp->min_xid] = ep; + list_add_tail(&ep->ex_list, &mp->ex_list); + fc_seq_alloc(ep, ep->seq_id++); + mp->total_exches++; + spin_unlock_bh(&mp->em_lock); + + /* + * update exchange + */ + ep->oxid = ep->xid = xid; + ep->em = mp; + ep->lp = mp->lp; + ep->f_ctl = FC_FC_FIRST_SEQ; /* next seq is first seq */ + ep->rxid = FC_XID_UNKNOWN; + ep->class = mp->class; + INIT_DELAYED_WORK(&ep->timeout_work, fc_exch_timeout); +out: + return ep; +err: + spin_unlock_bh(&mp->em_lock); + atomic_inc(&mp->stats.no_free_exch_xid); + mempool_free(ep, mp->ep_pool); + return NULL; +} +EXPORT_SYMBOL(fc_exch_alloc); + +/* + * Lookup and hold an exchange. + */ +static struct fc_exch *fc_exch_find(struct fc_exch_mgr *mp, u16 xid) +{ + struct fc_exch *ep = NULL; + + if ((xid >= mp->min_xid) && (xid <= mp->max_xid)) { + spin_lock_bh(&mp->em_lock); + ep = mp->exches[xid - mp->min_xid]; + if (ep) { + fc_exch_hold(ep); + WARN_ON(ep->xid != xid); + } + spin_unlock_bh(&mp->em_lock); + } + return ep; +} + +void fc_exch_done(struct fc_seq *sp) +{ + struct fc_exch *ep = fc_seq_exch(sp); + int rc; + + spin_lock_bh(&ep->ex_lock); + rc = fc_exch_done_locked(ep); + spin_unlock_bh(&ep->ex_lock); + if (!rc) + fc_exch_mgr_delete_ep(ep); +} +EXPORT_SYMBOL(fc_exch_done); + +/* + * Allocate a new exchange as responder. + * Sets the responder ID in the frame header. + */ +static struct fc_exch *fc_exch_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) +{ + struct fc_exch *ep; + struct fc_frame_header *fh; + u16 rxid; + + ep = mp->lp->tt.exch_get(mp->lp, fp); + if (ep) { + ep->class = fc_frame_class(fp); + + /* + * Set EX_CTX indicating we're responding on this exchange. 
+ */ + ep->f_ctl |= FC_FC_EX_CTX; /* we're responding */ + ep->f_ctl &= ~FC_FC_FIRST_SEQ; /* not new */ + fh = fc_frame_header_get(fp); + ep->sid = ntoh24(fh->fh_d_id); + ep->did = ntoh24(fh->fh_s_id); + ep->oid = ep->did; + + /* + * Allocated exchange has placed the XID in the + * originator field. Move it to the responder field, + * and set the originator XID from the frame. + */ + ep->rxid = ep->xid; + ep->oxid = ntohs(fh->fh_ox_id); + ep->esb_stat |= ESB_ST_RESP | ESB_ST_SEQ_INIT; + if ((ntoh24(fh->fh_f_ctl) & FC_FC_SEQ_INIT) == 0) + ep->esb_stat &= ~ESB_ST_SEQ_INIT; + + /* + * Set the responder ID in the frame header. + * The old one should've been 0xffff. + * If it isn't, don't assign one. + * Incoming basic link service frames may specify + * a referenced RX_ID. + */ + if (fh->fh_type != FC_TYPE_BLS) { + rxid = ntohs(fh->fh_rx_id); + WARN_ON(rxid != FC_XID_UNKNOWN); + fh->fh_rx_id = htons(ep->rxid); + } + fc_exch_hold(ep); /* hold for caller */ + spin_unlock_bh(&ep->ex_lock); /* lock from exch_get */ + } + return ep; +} + +/* + * Find a sequence for receive where the other end is originating the sequence. + * If fc_pf_rjt_reason is FC_RJT_NONE then this function will have a hold + * on the ep that should be released by the caller. + */ +static enum fc_pf_rjt_reason +fc_seq_lookup_recip(struct fc_exch_mgr *mp, struct fc_frame *fp) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + struct fc_exch *ep = NULL; + struct fc_seq *sp = NULL; + enum fc_pf_rjt_reason reject = FC_RJT_NONE; + u32 f_ctl; + u16 xid; + + f_ctl = ntoh24(fh->fh_f_ctl); + WARN_ON((f_ctl & FC_FC_SEQ_CTX) != 0); + + /* + * Lookup or create the exchange if we will be creating the sequence. 
+ */ + if (f_ctl & FC_FC_EX_CTX) { + xid = ntohs(fh->fh_ox_id); /* we originated exch */ + ep = fc_exch_find(mp, xid); + if (!ep) { + atomic_inc(&mp->stats.xid_not_found); + reject = FC_RJT_OX_ID; + goto out; + } + if (ep->rxid == FC_XID_UNKNOWN) + ep->rxid = ntohs(fh->fh_rx_id); + else if (ep->rxid != ntohs(fh->fh_rx_id)) { + reject = FC_RJT_OX_ID; + goto rel; + } + } else { + xid = ntohs(fh->fh_rx_id); /* we are the responder */ + + /* + * Special case for MDS issuing an ELS TEST with a + * bad rxid of 0. + * XXX take this out once we do the proper reject. + */ + if (xid == 0 && fh->fh_r_ctl == FC_RCTL_ELS_REQ && + fc_frame_payload_op(fp) == ELS_TEST) { + fh->fh_rx_id = htons(FC_XID_UNKNOWN); + xid = FC_XID_UNKNOWN; + } + + /* + * new sequence - find the exchange + */ + ep = fc_exch_find(mp, xid); + if ((f_ctl & FC_FC_FIRST_SEQ) && fc_sof_is_init(fr_sof(fp))) { + if (ep) { + atomic_inc(&mp->stats.xid_busy); + reject = FC_RJT_RX_ID; + goto rel; + } + ep = fc_exch_resp(mp, fp); + if (!ep) { + reject = FC_RJT_EXCH_EST; /* XXX */ + goto out; + } + xid = ep->xid; /* get our XID */ + } else if (!ep) { + atomic_inc(&mp->stats.xid_not_found); + reject = FC_RJT_RX_ID; /* XID not found */ + goto out; + } + } + + /* + * At this point, we have the exchange held. + * Find or create the sequence. 
+ */ + if (fc_sof_is_init(fr_sof(fp))) { + sp = fc_seq_start_next(&ep->seq); + if (!sp) { + reject = FC_RJT_SEQ_XS; /* exchange shortage */ + goto rel; + } + sp->id = fh->fh_seq_id; + sp->ssb_stat |= SSB_ST_RESP; + } else { + sp = &ep->seq; + if (sp->id != fh->fh_seq_id) { + atomic_inc(&mp->stats.seq_not_found); + reject = FC_RJT_SEQ_ID; /* sequence/exch should exist */ + goto rel; + } + } + WARN_ON(ep != fc_seq_exch(sp)); + + if (f_ctl & FC_FC_SEQ_INIT) + ep->esb_stat |= ESB_ST_SEQ_INIT; + + fr_seq(fp) = sp; +out: + return reject; +rel: + fc_exch_done(&ep->seq); + fc_exch_release(ep); /* hold from fc_exch_find/fc_exch_resp */ + return reject; +} + +/* + * Find the sequence for a frame being received. + * We originated the sequence, so it should be found. + * We may or may not have originated the exchange. + * Does not hold the sequence for the caller. + */ +static struct fc_seq *fc_seq_lookup_orig(struct fc_exch_mgr *mp, + struct fc_frame *fp) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + struct fc_exch *ep; + struct fc_seq *sp = NULL; + u32 f_ctl; + u16 xid; + + f_ctl = ntoh24(fh->fh_f_ctl); + WARN_ON((f_ctl & FC_FC_SEQ_CTX) != FC_FC_SEQ_CTX); + xid = ntohs((f_ctl & FC_FC_EX_CTX) ? fh->fh_ox_id : fh->fh_rx_id); + ep = fc_exch_find(mp, xid); + if (!ep) + return NULL; + if (ep->seq.id == fh->fh_seq_id) { + /* + * Save the RX_ID if we didn't previously know it. + */ + sp = &ep->seq; + if ((f_ctl & FC_FC_EX_CTX) != 0 && + ep->rxid == FC_XID_UNKNOWN) { + ep->rxid = ntohs(fh->fh_rx_id); + } + } + fc_exch_release(ep); + return sp; +} + +/* + * Set addresses for an exchange. + * Note this must be done before the first sequence of the exchange is sent. 
+ */ +static void fc_exch_set_addr(struct fc_exch *ep, + u32 orig_id, u32 resp_id) +{ + ep->oid = orig_id; + if (ep->esb_stat & ESB_ST_RESP) { + ep->sid = resp_id; + ep->did = orig_id; + } else { + ep->sid = orig_id; + ep->did = resp_id; + } +} + +static struct fc_seq *fc_seq_start_next_locked(struct fc_seq *sp) +{ + struct fc_exch *ep = fc_seq_exch(sp); + + sp = fc_seq_alloc(ep, ep->seq_id++); + FC_DEBUG_EXCH("exch %4x f_ctl %6x seq %2x\n", + ep->xid, ep->f_ctl, sp->id); + return sp; +} +/* + * Allocate a new sequence on the same exchange as the supplied sequence. + * This will never return NULL. + */ +struct fc_seq *fc_seq_start_next(struct fc_seq *sp) +{ + struct fc_exch *ep = fc_seq_exch(sp); + + spin_lock_bh(&ep->ex_lock); + WARN_ON((ep->esb_stat & ESB_ST_COMPLETE) != 0); + sp = fc_seq_start_next_locked(sp); + spin_unlock_bh(&ep->ex_lock); + + return sp; +} +EXPORT_SYMBOL(fc_seq_start_next); + +int fc_seq_send(struct fc_lport *lp, struct fc_seq *sp, struct fc_frame *fp) +{ + struct fc_exch *ep; + struct fc_frame_header *fh = fc_frame_header_get(fp); + int error; + u32 f_ctl; + + ep = fc_seq_exch(sp); + WARN_ON((ep->esb_stat & ESB_ST_SEQ_INIT) != ESB_ST_SEQ_INIT); + + f_ctl = ntoh24(fh->fh_f_ctl); + fc_exch_setup_hdr(ep, fp, f_ctl); + + /* + * update sequence count if this frame is carrying + * multiple FC frames when sequence offload is enabled + * by LLD. + */ + if (fr_max_payload(fp)) + sp->cnt += DIV_ROUND_UP((fr_len(fp) - sizeof(*fh)), + fr_max_payload(fp)); + else + sp->cnt++; + + /* + * Send the frame. + */ + error = lp->tt.frame_send(lp, fp); + + /* + * Update the exchange and sequence flags, + * assuming all frames for the sequence have been sent. + * We can only be called to send once for each sequence. 
+ */ + spin_lock_bh(&ep->ex_lock); + ep->f_ctl = f_ctl & ~FC_FC_FIRST_SEQ; /* not first seq */ + if (f_ctl & (FC_FC_END_SEQ | FC_FC_SEQ_INIT)) + ep->esb_stat &= ~ESB_ST_SEQ_INIT; + spin_unlock_bh(&ep->ex_lock); + return error; +} +EXPORT_SYMBOL(fc_seq_send); + +void fc_seq_els_rsp_send(struct fc_seq *sp, enum fc_els_cmd els_cmd, + struct fc_seq_els_data *els_data) +{ + switch (els_cmd) { + case ELS_LS_RJT: + fc_seq_ls_rjt(sp, els_data->reason, els_data->explan); + break; + case ELS_LS_ACC: + fc_seq_ls_acc(sp); + break; + case ELS_RRQ: + fc_exch_els_rrq(sp, els_data->fp); + break; + case ELS_REC: + fc_exch_els_rec(sp, els_data->fp); + break; + default: + FC_DBG("Invalid ELS CMD:%x\n", els_cmd); + } +} +EXPORT_SYMBOL(fc_seq_els_rsp_send); + +/* + * Send a sequence, which is also the last sequence in the exchange. + */ +static void fc_seq_send_last(struct fc_seq *sp, struct fc_frame *fp, + enum fc_rctl rctl, enum fc_fh_type fh_type) +{ + u32 f_ctl; + struct fc_exch *ep = fc_seq_exch(sp); + + f_ctl = FC_FC_LAST_SEQ | FC_FC_END_SEQ | FC_FC_SEQ_INIT; + f_ctl |= ep->f_ctl; + fc_fill_fc_hdr(fp, rctl, ep->did, ep->sid, fh_type, f_ctl, 0); + fc_seq_send(ep->lp, sp, fp); +} + +/* + * Send ACK_1 (or equiv.) indicating we received something. + * The frame we're acking is supplied. + */ +static void fc_seq_send_ack(struct fc_seq *sp, const struct fc_frame *rx_fp) +{ + struct fc_frame *fp; + struct fc_frame_header *rx_fh; + struct fc_frame_header *fh; + struct fc_exch *ep = fc_seq_exch(sp); + struct fc_lport *lp = ep->lp; + unsigned int f_ctl; + + /* + * Don't send ACKs for class 3. + */ + if (fc_sof_needs_ack(fr_sof(rx_fp))) { + fp = fc_frame_alloc(lp, 0); + if (!fp) + return; + + fh = fc_frame_header_get(fp); + fh->fh_r_ctl = FC_RCTL_ACK_1; + fh->fh_type = FC_TYPE_BLS; + + /* + * Form f_ctl by inverting EX_CTX and SEQ_CTX (bits 23, 22). + * Echo FIRST_SEQ, LAST_SEQ, END_SEQ, END_CONN, SEQ_INIT. + * Bits 9-8 are meaningful (retransmitted or unidirectional). 
+ * Last ACK uses bits 7-6 (continue sequence), + * bits 5-4 are meaningful (what kind of ACK to use). + */ + rx_fh = fc_frame_header_get(rx_fp); + f_ctl = ntoh24(rx_fh->fh_f_ctl); + f_ctl &= FC_FC_EX_CTX | FC_FC_SEQ_CTX | + FC_FC_FIRST_SEQ | FC_FC_LAST_SEQ | + FC_FC_END_SEQ | FC_FC_END_CONN | FC_FC_SEQ_INIT | + FC_FC_RETX_SEQ | FC_FC_UNI_TX; + f_ctl ^= FC_FC_EX_CTX | FC_FC_SEQ_CTX; + hton24(fh->fh_f_ctl, f_ctl); + + fc_exch_setup_hdr(ep, fp, f_ctl); + fh->fh_seq_id = rx_fh->fh_seq_id; + fh->fh_seq_cnt = rx_fh->fh_seq_cnt; + fh->fh_parm_offset = htonl(1); /* ack single frame */ + + fr_sof(fp) = fr_sof(rx_fp); + if (f_ctl & FC_FC_END_SEQ) + fr_eof(fp) = FC_EOF_T; + else + fr_eof(fp) = FC_EOF_N; + + (void) lp->tt.frame_send(lp, fp); + } +} + +/* + * Send BLS Reject. + * This is for rejecting BA_ABTS only. + */ +static void +fc_exch_send_ba_rjt(struct fc_frame *rx_fp, enum fc_ba_rjt_reason reason, + enum fc_ba_rjt_explan explan) +{ + struct fc_frame *fp; + struct fc_frame_header *rx_fh; + struct fc_frame_header *fh; + struct fc_ba_rjt *rp; + struct fc_lport *lp; + unsigned int f_ctl; + + lp = fr_dev(rx_fp); + fp = fc_frame_alloc(lp, sizeof(*rp)); + if (!fp) + return; + fh = fc_frame_header_get(fp); + rx_fh = fc_frame_header_get(rx_fp); + + memset(fh, 0, sizeof(*fh) + sizeof(*rp)); + + rp = fc_frame_payload_get(fp, sizeof(*rp)); + rp->br_reason = reason; + rp->br_explan = explan; + + /* + * seq_id, cs_ctl, df_ctl and param/offset are zero. + */ + memcpy(fh->fh_s_id, rx_fh->fh_d_id, 3); + memcpy(fh->fh_d_id, rx_fh->fh_s_id, 3); + fh->fh_ox_id = rx_fh->fh_rx_id; + fh->fh_rx_id = rx_fh->fh_ox_id; + fh->fh_seq_cnt = rx_fh->fh_seq_cnt; + fh->fh_r_ctl = FC_RCTL_BA_RJT; + fh->fh_type = FC_TYPE_BLS; + + /* + * Form f_ctl by inverting EX_CTX and SEQ_CTX (bits 23, 22). + * Echo FIRST_SEQ, LAST_SEQ, END_SEQ, END_CONN, SEQ_INIT. + * Bits 9-8 are meaningful (retransmitted or unidirectional). 
+ * Last ACK uses bits 7-6 (continue sequence), + * bits 5-4 are meaningful (what kind of ACK to use). + * Always set LAST_SEQ, END_SEQ. + */ + f_ctl = ntoh24(rx_fh->fh_f_ctl); + f_ctl &= FC_FC_EX_CTX | FC_FC_SEQ_CTX | + FC_FC_END_CONN | FC_FC_SEQ_INIT | + FC_FC_RETX_SEQ | FC_FC_UNI_TX; + f_ctl ^= FC_FC_EX_CTX | FC_FC_SEQ_CTX; + f_ctl |= FC_FC_LAST_SEQ | FC_FC_END_SEQ; + f_ctl &= ~FC_FC_FIRST_SEQ; + hton24(fh->fh_f_ctl, f_ctl); + + fr_sof(fp) = fc_sof_class(fr_sof(rx_fp)); + fr_eof(fp) = FC_EOF_T; + if (fc_sof_needs_ack(fr_sof(fp))) + fr_eof(fp) = FC_EOF_N; + + (void) lp->tt.frame_send(lp, fp); +} + +/* + * Handle an incoming ABTS. This would be for target mode usually, + * but could be due to lost FCP transfer ready, confirm or RRQ. + * We always handle this as an exchange abort, ignoring the parameter. + */ +static void fc_exch_recv_abts(struct fc_exch *ep, struct fc_frame *rx_fp) +{ + struct fc_frame *fp; + struct fc_ba_acc *ap; + struct fc_frame_header *fh; + struct fc_seq *sp; + + if (!ep) + goto reject; + spin_lock_bh(&ep->ex_lock); + if (ep->esb_stat & ESB_ST_COMPLETE) { + spin_unlock_bh(&ep->ex_lock); + goto reject; + } + if (!(ep->esb_stat & ESB_ST_REC_QUAL)) + fc_exch_hold(ep); /* hold for REC_QUAL */ + ep->esb_stat |= ESB_ST_ABNORMAL | ESB_ST_REC_QUAL; + fc_exch_timer_set_locked(ep, ep->r_a_tov); + + fp = fc_frame_alloc(ep->lp, sizeof(*ap)); + if (!fp) { + spin_unlock_bh(&ep->ex_lock); + goto free; + } + fh = fc_frame_header_get(fp); + ap = fc_frame_payload_get(fp, sizeof(*ap)); + memset(ap, 0, sizeof(*ap)); + sp = &ep->seq; + ap->ba_high_seq_cnt = htons(0xffff); + if (sp->ssb_stat & SSB_ST_RESP) { + ap->ba_seq_id = sp->id; + ap->ba_seq_id_val = FC_BA_SEQ_ID_VAL; + ap->ba_high_seq_cnt = fh->fh_seq_cnt; + ap->ba_low_seq_cnt = htons(sp->cnt); + } + sp = fc_seq_start_next(sp); + spin_unlock_bh(&ep->ex_lock); + fc_seq_send_last(sp, fp, FC_RCTL_BA_ACC, FC_TYPE_BLS); + fc_frame_free(rx_fp); + return; + +reject: + fc_exch_send_ba_rjt(rx_fp, FC_BA_RJT_UNABLE, 
FC_BA_RJT_INV_XID); +free: + fc_frame_free(rx_fp); +} + +/* + * Handle receive where the other end is originating the sequence. + */ +static void fc_exch_recv_req(struct fc_lport *lp, struct fc_exch_mgr *mp, + struct fc_frame *fp) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + struct fc_seq *sp = NULL; + struct fc_exch *ep = NULL; + enum fc_sof sof; + enum fc_eof eof; + u32 f_ctl; + enum fc_pf_rjt_reason reject; + + fr_seq(fp) = NULL; + reject = fc_seq_lookup_recip(mp, fp); + if (reject == FC_RJT_NONE) { + sp = fr_seq(fp); /* sequence will be held */ + ep = fc_seq_exch(sp); + sof = fr_sof(fp); + eof = fr_eof(fp); + f_ctl = ntoh24(fh->fh_f_ctl); + fc_seq_send_ack(sp, fp); + + /* + * Call the receive function. + * + * The receive function may allocate a new sequence + * over the old one, so we shouldn't change the + * sequence after this. + * + * The frame will be freed by the receive function. + * If new exch resp handler is valid then call that + * first. + */ + if (ep->resp) + ep->resp(sp, fp, ep->arg); + else + lp->tt.lport_recv(lp, sp, fp); + fc_exch_release(ep); /* release from lookup */ + } else { + FC_DEBUG_EXCH("exch/seq lookup failed: reject %x\n", reject); + fc_frame_free(fp); + } +} + +/* + * Handle receive where the other end is originating the sequence in + * response to our exchange. 
+ */ +static void fc_exch_recv_seq_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + struct fc_seq *sp; + struct fc_exch *ep; + enum fc_sof sof; + u32 f_ctl; + void (*resp)(struct fc_seq *, struct fc_frame *fp, void *arg); + void *ex_resp_arg; + int rc; + + ep = fc_exch_find(mp, ntohs(fh->fh_ox_id)); + if (!ep) { + atomic_inc(&mp->stats.xid_not_found); + goto out; + } + if (ep->rxid == FC_XID_UNKNOWN) + ep->rxid = ntohs(fh->fh_rx_id); + if (ep->sid != 0 && ep->sid != ntoh24(fh->fh_d_id)) { + atomic_inc(&mp->stats.xid_not_found); + goto rel; + } + if (ep->did != ntoh24(fh->fh_s_id) && + ep->did != FC_FID_FLOGI) { + atomic_inc(&mp->stats.xid_not_found); + goto rel; + } + sof = fr_sof(fp); + if (fc_sof_is_init(sof)) { + sp = fc_seq_start_next(&ep->seq); + sp->id = fh->fh_seq_id; + sp->ssb_stat |= SSB_ST_RESP; + } else { + sp = &ep->seq; + if (sp->id != fh->fh_seq_id) { + atomic_inc(&mp->stats.seq_not_found); + goto rel; + } + } + f_ctl = ntoh24(fh->fh_f_ctl); + fr_seq(fp) = sp; + if (f_ctl & FC_FC_SEQ_INIT) + ep->esb_stat |= ESB_ST_SEQ_INIT; + + if (fc_sof_needs_ack(sof)) + fc_seq_send_ack(sp, fp); + resp = ep->resp; + ex_resp_arg = ep->arg; + + if (fh->fh_type != FC_TYPE_FCP && fr_eof(fp) == FC_EOF_T && + (f_ctl & (FC_FC_LAST_SEQ | FC_FC_END_SEQ)) == + (FC_FC_LAST_SEQ | FC_FC_END_SEQ)) { + spin_lock_bh(&ep->ex_lock); + rc = fc_exch_done_locked(ep); + WARN_ON(fc_seq_exch(sp) != ep); + spin_unlock_bh(&ep->ex_lock); + if (!rc) + fc_exch_mgr_delete_ep(ep); + } + + /* + * Call the receive function. + * The sequence is held (has a refcnt) for us, + * but not for the receive function. + * + * The receive function may allocate a new sequence + * over the old one, so we shouldn't change the + * sequence after this. + * + * The frame will be freed by the receive function. + * If new exch resp handler is valid then call that + * first. 
+ */ + if (resp) + resp(sp, fp, ex_resp_arg); + else + fc_frame_free(fp); + fc_exch_release(ep); + return; +rel: + fc_exch_release(ep); +out: + fc_frame_free(fp); +} + +/* + * Handle receive for a sequence where other end is responding to our sequence. + */ +static void fc_exch_recv_resp(struct fc_exch_mgr *mp, struct fc_frame *fp) +{ + struct fc_seq *sp; + + sp = fc_seq_lookup_orig(mp, fp); /* doesn't hold sequence */ + if (!sp) { + atomic_inc(&mp->stats.xid_not_found); + FC_DEBUG_EXCH("seq lookup failed\n"); + } else { + atomic_inc(&mp->stats.non_bls_resp); + FC_DEBUG_EXCH("non-BLS response to sequence"); + } + fc_frame_free(fp); +} + +/* + * Handle the response to an ABTS for exchange or sequence. + * This can be BA_ACC or BA_RJT. + */ +static void fc_exch_abts_resp(struct fc_exch *ep, struct fc_frame *fp) +{ + void (*resp)(struct fc_seq *, struct fc_frame *fp, void *arg); + void *ex_resp_arg; + struct fc_frame_header *fh; + struct fc_ba_acc *ap; + struct fc_seq *sp; + u16 low; + u16 high; + int rc = 1, has_rec = 0; + + fh = fc_frame_header_get(fp); + FC_DEBUG_EXCH("exch: BLS rctl %x - %s\n", + fh->fh_r_ctl, fc_exch_rctl_name(fh->fh_r_ctl)); + + if (cancel_delayed_work_sync(&ep->timeout_work)) + fc_exch_release(ep); /* release from pending timer hold */ + + spin_lock_bh(&ep->ex_lock); + switch (fh->fh_r_ctl) { + case FC_RCTL_BA_ACC: + ap = fc_frame_payload_get(fp, sizeof(*ap)); + if (!ap) + break; + + /* + * Decide whether to establish a Recovery Qualifier. + * We do this if there is a non-empty SEQ_CNT range and + * SEQ_ID is the same as the one we aborted. 
+ */ + low = ntohs(ap->ba_low_seq_cnt); + high = ntohs(ap->ba_high_seq_cnt); + if ((ep->esb_stat & ESB_ST_REC_QUAL) == 0 && + (ap->ba_seq_id_val != FC_BA_SEQ_ID_VAL || + ap->ba_seq_id == ep->seq_id) && low != high) { + ep->esb_stat |= ESB_ST_REC_QUAL; + fc_exch_hold(ep); /* hold for recovery qualifier */ + has_rec = 1; + } + break; + case FC_RCTL_BA_RJT: + break; + default: + break; + } + + resp = ep->resp; + ex_resp_arg = ep->arg; + + /* do we need to do some other checks here. Can we reuse more of + * fc_exch_recv_seq_resp + */ + sp = &ep->seq; + /* + * do we want to check END_SEQ as well as LAST_SEQ here? + */ + if (ep->fh_type != FC_TYPE_FCP && + ntoh24(fh->fh_f_ctl) & FC_FC_LAST_SEQ) + rc = fc_exch_done_locked(ep); + spin_unlock_bh(&ep->ex_lock); + if (!rc) + fc_exch_mgr_delete_ep(ep); + + if (resp) + resp(sp, fp, ex_resp_arg); + else + fc_frame_free(fp); + + if (has_rec) + fc_exch_timer_set(ep, ep->r_a_tov); + +} + +/* + * Receive BLS sequence. + * This is always a sequence initiated by the remote side. + * We may be either the originator or recipient of the exchange. + */ +static void fc_exch_recv_bls(struct fc_exch_mgr *mp, struct fc_frame *fp) +{ + struct fc_frame_header *fh; + struct fc_exch *ep; + u32 f_ctl; + + fh = fc_frame_header_get(fp); + f_ctl = ntoh24(fh->fh_f_ctl); + fr_seq(fp) = NULL; + + ep = fc_exch_find(mp, (f_ctl & FC_FC_EX_CTX) ? + ntohs(fh->fh_ox_id) : ntohs(fh->fh_rx_id)); + if (ep && (f_ctl & FC_FC_SEQ_INIT)) { + spin_lock_bh(&ep->ex_lock); + ep->esb_stat |= ESB_ST_SEQ_INIT; + spin_unlock_bh(&ep->ex_lock); + } + if (f_ctl & FC_FC_SEQ_CTX) { + /* + * A response to a sequence we initiated. + * This should only be ACKs for class 2 or F. 
+ */ + switch (fh->fh_r_ctl) { + case FC_RCTL_ACK_1: + case FC_RCTL_ACK_0: + break; + default: + FC_DEBUG_EXCH("BLS rctl %x - %s received", + fh->fh_r_ctl, + fc_exch_rctl_name(fh->fh_r_ctl)); + break; + } + fc_frame_free(fp); + } else { + switch (fh->fh_r_ctl) { + case FC_RCTL_BA_RJT: + case FC_RCTL_BA_ACC: + if (ep) + fc_exch_abts_resp(ep, fp); + else + fc_frame_free(fp); + break; + case FC_RCTL_BA_ABTS: + fc_exch_recv_abts(ep, fp); + break; + default: /* ignore junk */ + fc_frame_free(fp); + break; + } + } + if (ep) + fc_exch_release(ep); /* release hold taken by fc_exch_find */ +} + +/* + * Accept sequence with LS_ACC. + * If this fails due to allocation or transmit congestion, assume the + * originator will repeat the sequence. + */ +static void fc_seq_ls_acc(struct fc_seq *req_sp) +{ + struct fc_seq *sp; + struct fc_els_ls_acc *acc; + struct fc_frame *fp; + + sp = fc_seq_start_next(req_sp); + fp = fc_frame_alloc(fc_seq_exch(sp)->lp, sizeof(*acc)); + if (fp) { + acc = fc_frame_payload_get(fp, sizeof(*acc)); + memset(acc, 0, sizeof(*acc)); + acc->la_cmd = ELS_LS_ACC; + fc_seq_send_last(sp, fp, FC_RCTL_ELS_REP, FC_TYPE_ELS); + } +} + +/* + * Reject sequence with ELS LS_RJT. + * If this fails due to allocation or transmit congestion, assume the + * originator will repeat the sequence. 
+ */ +static void fc_seq_ls_rjt(struct fc_seq *req_sp, enum fc_els_rjt_reason reason, + enum fc_els_rjt_explan explan) +{ + struct fc_seq *sp; + struct fc_els_ls_rjt *rjt; + struct fc_frame *fp; + + sp = fc_seq_start_next(req_sp); + fp = fc_frame_alloc(fc_seq_exch(sp)->lp, sizeof(*rjt)); + if (fp) { + rjt = fc_frame_payload_get(fp, sizeof(*rjt)); + memset(rjt, 0, sizeof(*rjt)); + rjt->er_cmd = ELS_LS_RJT; + rjt->er_reason = reason; + rjt->er_explan = explan; + fc_seq_send_last(sp, fp, FC_RCTL_ELS_REP, FC_TYPE_ELS); + } +} + +static void fc_exch_reset(struct fc_exch *ep) +{ + struct fc_seq *sp; + void (*resp)(struct fc_seq *, struct fc_frame *, void *); + void *arg; + int rc = 1; + + spin_lock_bh(&ep->ex_lock); + ep->state |= FC_EX_RST_CLEANUP; + /* + * we really want to call del_timer_sync, but cannot due + * to the lport calling with the lport lock held (some resp + * functions can also grab the lport lock which could cause + * a deadlock). + */ + if (cancel_delayed_work(&ep->timeout_work)) + atomic_dec(&ep->ex_refcnt); /* drop hold for timer */ + resp = ep->resp; + ep->resp = NULL; + if (ep->esb_stat & ESB_ST_REC_QUAL) + atomic_dec(&ep->ex_refcnt); /* drop hold for rec_qual */ + ep->esb_stat &= ~ESB_ST_REC_QUAL; + arg = ep->arg; + sp = &ep->seq; + rc = fc_exch_done_locked(ep); + spin_unlock_bh(&ep->ex_lock); + if (!rc) + fc_exch_mgr_delete_ep(ep); + + if (resp) + resp(sp, ERR_PTR(-FC_EX_CLOSED), arg); +} + +/* + * Reset an exchange manager, releasing all sequences and exchanges. + * If sid is non-zero, reset only exchanges we source from that FID. + * If did is non-zero, reset only exchanges destined to that FID. 
+ */ +void fc_exch_mgr_reset(struct fc_exch_mgr *mp, u32 sid, u32 did) +{ + struct fc_exch *ep; + struct fc_exch *next; + + spin_lock_bh(&mp->em_lock); +restart: + list_for_each_entry_safe(ep, next, &mp->ex_list, ex_list) { + if ((sid == 0 || sid == ep->sid) && + (did == 0 || did == ep->did)) { + fc_exch_hold(ep); + spin_unlock_bh(&mp->em_lock); + + fc_exch_reset(ep); + + fc_exch_release(ep); + spin_lock_bh(&mp->em_lock); + + /* + * must restart the loop in case multiple eps were + * released while the lock was dropped. + */ + goto restart; + } + } + spin_unlock_bh(&mp->em_lock); +} +EXPORT_SYMBOL(fc_exch_mgr_reset); + +/* + * Handle incoming ELS REC - Read Exchange Concise. + * Note that the requesting port may be different than the S_ID in the request. + */ +static void fc_exch_els_rec(struct fc_seq *sp, struct fc_frame *rfp) +{ + struct fc_frame *fp; + struct fc_exch *ep; + struct fc_exch_mgr *em; + struct fc_els_rec *rp; + struct fc_els_rec_acc *acc; + enum fc_els_rjt_reason reason = ELS_RJT_LOGIC; + enum fc_els_rjt_explan explan; + u32 sid; + u16 rxid; + u16 oxid; + + rp = fc_frame_payload_get(rfp, sizeof(*rp)); + explan = ELS_EXPL_INV_LEN; + if (!rp) + goto reject; + sid = ntoh24(rp->rec_s_id); + rxid = ntohs(rp->rec_rx_id); + oxid = ntohs(rp->rec_ox_id); + + /* + * Currently it's hard to find the local S_ID from the exchange + * manager. This will eventually be fixed, but for now it's easier + * to look up the subject exchange twice, once as if we were + * the initiator, and then again if we weren't.
+ */ + em = fc_seq_exch(sp)->em; + ep = fc_exch_find(em, oxid); + explan = ELS_EXPL_OXID_RXID; + if (ep && ep->oid == sid) { + if (ep->rxid != FC_XID_UNKNOWN && + rxid != FC_XID_UNKNOWN && + ep->rxid != rxid) + goto rel; + } else { + if (ep) + fc_exch_release(ep); + ep = NULL; + if (rxid != FC_XID_UNKNOWN) + ep = fc_exch_find(em, rxid); + if (!ep) + goto reject; + } + + fp = fc_frame_alloc(fc_seq_exch(sp)->lp, sizeof(*acc)); + if (!fp) { + fc_exch_done(sp); + goto out; + } + sp = fc_seq_start_next(sp); + acc = fc_frame_payload_get(fp, sizeof(*acc)); + memset(acc, 0, sizeof(*acc)); + acc->reca_cmd = ELS_LS_ACC; + acc->reca_ox_id = rp->rec_ox_id; + memcpy(acc->reca_ofid, rp->rec_s_id, 3); + acc->reca_rx_id = htons(ep->rxid); + if (ep->sid == ep->oid) + hton24(acc->reca_rfid, ep->did); + else + hton24(acc->reca_rfid, ep->sid); + acc->reca_fc4value = htonl(ep->seq.rec_data); + acc->reca_e_stat = htonl(ep->esb_stat & (ESB_ST_RESP | + ESB_ST_SEQ_INIT | + ESB_ST_COMPLETE)); + sp = fc_seq_start_next(sp); + fc_seq_send_last(sp, fp, FC_RCTL_ELS_REP, FC_TYPE_ELS); +out: + fc_exch_release(ep); + fc_frame_free(rfp); + return; + +rel: + fc_exch_release(ep); +reject: + fc_seq_ls_rjt(sp, reason, explan); + fc_frame_free(rfp); +} + +/* + * Handle response from RRQ. + * Not much to do here, really. + * Should report errors. + * + * TODO: fix error handler. 
+ */ +static void fc_exch_rrq_resp(struct fc_seq *sp, struct fc_frame *fp, void *arg) +{ + struct fc_exch *aborted_ep = arg; + unsigned int op; + + if (IS_ERR(fp)) { + int err = PTR_ERR(fp); + + if (err == -FC_EX_CLOSED) + goto cleanup; + FC_DBG("Cannot process RRQ, because of frame error %d\n", err); + return; + } + + op = fc_frame_payload_op(fp); + fc_frame_free(fp); + + switch (op) { + case ELS_LS_RJT: + FC_DBG("LS_RJT for RRQ"); + /* fall through */ + case ELS_LS_ACC: + goto cleanup; + default: + FC_DBG("unexpected response op %x for RRQ", op); + return; + } + +cleanup: + fc_exch_done(&aborted_ep->seq); + /* drop hold for rec qual */ + fc_exch_release(aborted_ep); +} + +/* + * Send ELS RRQ - Reinstate Recovery Qualifier. + * This tells the remote port to stop blocking the use of + * the exchange and the seq_cnt range. + */ +static void fc_exch_rrq(struct fc_exch *ep) +{ + struct fc_lport *lp; + struct fc_els_rrq *rrq; + struct fc_frame *fp; + struct fc_seq *rrq_sp; + u32 did; + + lp = ep->lp; + + fp = fc_frame_alloc(lp, sizeof(*rrq)); + if (!fp) + return; + rrq = fc_frame_payload_get(fp, sizeof(*rrq)); + memset(rrq, 0, sizeof(*rrq)); + rrq->rrq_cmd = ELS_RRQ; + hton24(rrq->rrq_s_id, ep->sid); + rrq->rrq_ox_id = htons(ep->oxid); + rrq->rrq_rx_id = htons(ep->rxid); + + did = ep->did; + if (ep->esb_stat & ESB_ST_RESP) + did = ep->sid; + + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REQ, did, + fc_host_port_id(lp->host), FC_TYPE_ELS, + FC_FC_FIRST_SEQ | FC_FC_END_SEQ | FC_FC_SEQ_INIT, 0); + + rrq_sp = fc_exch_seq_send(lp, fp, fc_exch_rrq_resp, NULL, ep, + lp->e_d_tov); + if (!rrq_sp) { + ep->esb_stat |= ESB_ST_REC_QUAL; + fc_exch_timer_set_locked(ep, ep->r_a_tov); + return; + } +} + + +/* + * Handle incoming ELS RRQ - Reinstate Recovery Qualifier.
+ */ +static void fc_exch_els_rrq(struct fc_seq *sp, struct fc_frame *fp) +{ + struct fc_exch *ep; /* request or subject exchange */ + struct fc_els_rrq *rp; + u32 sid; + u16 xid; + enum fc_els_rjt_explan explan; + + rp = fc_frame_payload_get(fp, sizeof(*rp)); + explan = ELS_EXPL_INV_LEN; + if (!rp) + goto reject; + + /* + * lookup subject exchange. + */ + ep = fc_seq_exch(sp); + sid = ntoh24(rp->rrq_s_id); /* subject source */ + xid = ep->did == sid ? ntohs(rp->rrq_ox_id) : ntohs(rp->rrq_rx_id); + ep = fc_exch_find(ep->em, xid); + + explan = ELS_EXPL_OXID_RXID; + if (!ep) + goto reject; + spin_lock_bh(&ep->ex_lock); + if (ep->oxid != ntohs(rp->rrq_ox_id)) + goto unlock_reject; + if (ep->rxid != ntohs(rp->rrq_rx_id) && + ep->rxid != FC_XID_UNKNOWN) + goto unlock_reject; + explan = ELS_EXPL_SID; + if (ep->sid != sid) + goto unlock_reject; + + /* + * Clear Recovery Qualifier state, and cancel timer if complete. + */ + if (ep->esb_stat & ESB_ST_REC_QUAL) { + ep->esb_stat &= ~ESB_ST_REC_QUAL; + atomic_dec(&ep->ex_refcnt); /* drop hold for rec qual */ + } + if (ep->esb_stat & ESB_ST_COMPLETE) { + if (cancel_delayed_work(&ep->timeout_work)) + atomic_dec(&ep->ex_refcnt); /* drop timer hold */ + } + + spin_unlock_bh(&ep->ex_lock); + + /* + * Send LS_ACC. 
+ */ + fc_seq_ls_acc(sp); + fc_frame_free(fp); + return; + +unlock_reject: + spin_unlock_bh(&ep->ex_lock); + fc_exch_release(ep); /* drop hold from fc_exch_find */ +reject: + fc_seq_ls_rjt(sp, ELS_RJT_LOGIC, explan); + fc_frame_free(fp); +} + +struct fc_exch_mgr *fc_exch_mgr_alloc(struct fc_lport *lp, + enum fc_class class, + u16 min_xid, u16 max_xid) +{ + struct fc_exch_mgr *mp; + size_t len; + + if (max_xid <= min_xid || min_xid == 0 || max_xid == FC_XID_UNKNOWN) { + FC_DBG("Invalid min_xid 0x:%x and max_xid 0x:%x\n", + min_xid, max_xid); + return NULL; + } + + /* + * Memory need for EM + */ +#define xid_ok(i, m1, m2) (((i) >= (m1)) && ((i) <= (m2))) + len = (max_xid - min_xid + 1) * (sizeof(struct fc_exch *)); + len += sizeof(struct fc_exch_mgr); + + mp = kzalloc(len, GFP_ATOMIC); + if (!mp) + return NULL; + + mp->class = class; + mp->total_exches = 0; + mp->exches = (struct fc_exch **)(mp + 1); + mp->lp = lp; + /* adjust em exch xid range for offload */ + mp->min_xid = min_xid; + mp->max_xid = max_xid; + mp->last_xid = min_xid - 1; + mp->max_read = 0; + mp->last_read = 0; + if (lp->lro_enabled && xid_ok(lp->lro_xid, min_xid, max_xid)) { + mp->max_read = lp->lro_xid; + mp->last_read = min_xid - 1; + mp->last_xid = mp->max_read; + } else { + /* disable lro if no xid control over read */ + lp->lro_enabled = 0; + } + + INIT_LIST_HEAD(&mp->ex_list); + spin_lock_init(&mp->em_lock); + + mp->ep_pool = mempool_create_slab_pool(2, fc_em_cachep); + if (!mp->ep_pool) + goto free_mp; + + return mp; + +free_mp: + kfree(mp); + return NULL; +} +EXPORT_SYMBOL(fc_exch_mgr_alloc); + +void fc_exch_mgr_free(struct fc_exch_mgr *mp) +{ + WARN_ON(!mp); + /* + * The total exch count must be zero + * before freeing exchange manager. 
+ */ + WARN_ON(mp->total_exches != 0); + mempool_destroy(mp->ep_pool); + kfree(mp); +} +EXPORT_SYMBOL(fc_exch_mgr_free); + +struct fc_exch *fc_exch_get(struct fc_lport *lp, struct fc_frame *fp) +{ + if (!lp || !lp->emp) + return NULL; + + return fc_exch_alloc(lp->emp, fp, 0); +} +EXPORT_SYMBOL(fc_exch_get); + +struct fc_seq *fc_exch_seq_send(struct fc_lport *lp, + struct fc_frame *fp, + void (*resp)(struct fc_seq *, + struct fc_frame *fp, + void *arg), + void (*destructor)(struct fc_seq *, void *), + void *arg, u32 timer_msec) +{ + struct fc_exch *ep; + struct fc_seq *sp = NULL; + struct fc_frame_header *fh; + int rc = 1; + + ep = lp->tt.exch_get(lp, fp); + if (!ep) { + fc_frame_free(fp); + return NULL; + } + ep->esb_stat |= ESB_ST_SEQ_INIT; + fh = fc_frame_header_get(fp); + fc_exch_set_addr(ep, ntoh24(fh->fh_s_id), ntoh24(fh->fh_d_id)); + ep->resp = resp; + ep->destructor = destructor; + ep->arg = arg; + ep->r_a_tov = FC_DEF_R_A_TOV; + ep->lp = lp; + sp = &ep->seq; + + ep->fh_type = fh->fh_type; /* save for possible timeout handling */ + ep->f_ctl = ntoh24(fh->fh_f_ctl); + fc_exch_setup_hdr(ep, fp, ep->f_ctl); + sp->cnt++; + + if (unlikely(lp->tt.frame_send(lp, fp))) + goto err; + + if (timer_msec) + fc_exch_timer_set_locked(ep, timer_msec); + ep->f_ctl &= ~FC_FC_FIRST_SEQ; /* not first seq */ + + if (ep->f_ctl & FC_FC_SEQ_INIT) + ep->esb_stat &= ~ESB_ST_SEQ_INIT; + spin_unlock_bh(&ep->ex_lock); + return sp; +err: + rc = fc_exch_done_locked(ep); + spin_unlock_bh(&ep->ex_lock); + if (!rc) + fc_exch_mgr_delete_ep(ep); + return NULL; +} +EXPORT_SYMBOL(fc_exch_seq_send); + +/* + * Receive a frame + */ +void fc_exch_recv(struct fc_lport *lp, struct fc_exch_mgr *mp, + struct fc_frame *fp) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + u32 f_ctl; + + /* lport lock ?
*/ + if (!lp || !mp || (lp->state == LPORT_ST_NONE)) { + FC_DBG("fc_lport or EM is not allocated and configured"); + fc_frame_free(fp); + return; + } + + /* + * If frame is marked invalid, just drop it. + */ + f_ctl = ntoh24(fh->fh_f_ctl); + switch (fr_eof(fp)) { + case FC_EOF_T: + if (f_ctl & FC_FC_END_SEQ) + skb_trim(fp_skb(fp), fr_len(fp) - FC_FC_FILL(f_ctl)); + /* fall through */ + case FC_EOF_N: + if (fh->fh_type == FC_TYPE_BLS) + fc_exch_recv_bls(mp, fp); + else if ((f_ctl & (FC_FC_EX_CTX | FC_FC_SEQ_CTX)) == + FC_FC_EX_CTX) + fc_exch_recv_seq_resp(mp, fp); + else if (f_ctl & FC_FC_SEQ_CTX) + fc_exch_recv_resp(mp, fp); + else + fc_exch_recv_req(lp, mp, fp); + break; + default: + FC_DBG("dropping invalid frame (eof %x)", fr_eof(fp)); + fc_frame_free(fp); + break; + } +} +EXPORT_SYMBOL(fc_exch_recv); + +int fc_exch_init(struct fc_lport *lp) +{ + if (!lp->tt.exch_get) { + /* + * exch_put() should be NULL if + * exch_get() is NULL + */ + WARN_ON(lp->tt.exch_put); + lp->tt.exch_get = fc_exch_get; + } + + if (!lp->tt.seq_start_next) + lp->tt.seq_start_next = fc_seq_start_next; + + if (!lp->tt.exch_seq_send) + lp->tt.exch_seq_send = fc_exch_seq_send; + + if (!lp->tt.seq_send) + lp->tt.seq_send = fc_seq_send; + + if (!lp->tt.seq_els_rsp_send) + lp->tt.seq_els_rsp_send = fc_seq_els_rsp_send; + + if (!lp->tt.exch_done) + lp->tt.exch_done = fc_exch_done; + + if (!lp->tt.exch_mgr_reset) + lp->tt.exch_mgr_reset = fc_exch_mgr_reset; + + if (!lp->tt.seq_exch_abort) + lp->tt.seq_exch_abort = fc_seq_exch_abort; + + return 0; +} +EXPORT_SYMBOL(fc_exch_init); + +int fc_setup_exch_mgr(void) +{ + fc_em_cachep = kmem_cache_create("libfc_em", sizeof(struct fc_exch), + 0, SLAB_HWCACHE_ALIGN, NULL); + if (!fc_em_cachep) + return -ENOMEM; + return 0; +} + +void fc_destroy_exch_mgr(void) +{ + kmem_cache_destroy(fc_em_cachep); +} diff --git a/drivers/scsi/libfc/fc_fcp.c b/drivers/scsi/libfc/fc_fcp.c new file mode 100644 index 00000000000..404e63ff46b --- /dev/null +++ 
b/drivers/scsi/libfc/fc_fcp.c @@ -0,0 +1,2131 @@ +/* + * Copyright(c) 2007 Intel Corporation. All rights reserved. + * Copyright(c) 2008 Red Hat, Inc. All rights reserved. + * Copyright(c) 2008 Mike Christie + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include + +#include +#include + +MODULE_AUTHOR("Open-FCoE.org"); +MODULE_DESCRIPTION("libfc"); +MODULE_LICENSE("GPL"); + +static int fc_fcp_debug; + +#define FC_DEBUG_FCP(fmt...) 
\ + do { \ + if (fc_fcp_debug) \ + FC_DBG(fmt); \ + } while (0) + +static struct kmem_cache *scsi_pkt_cachep; + +/* SRB state definitions */ +#define FC_SRB_FREE 0 /* cmd is free */ +#define FC_SRB_CMD_SENT (1 << 0) /* cmd has been sent */ +#define FC_SRB_RCV_STATUS (1 << 1) /* response has arrived */ +#define FC_SRB_ABORT_PENDING (1 << 2) /* cmd abort sent to device */ +#define FC_SRB_ABORTED (1 << 3) /* abort acknowledged */ +#define FC_SRB_DISCONTIG (1 << 4) /* non-sequential data recvd */ +#define FC_SRB_COMPL (1 << 5) /* fc_io_compl has been run */ +#define FC_SRB_FCP_PROCESSING_TMO (1 << 6) /* timer function processing */ +#define FC_SRB_NOMEM (1 << 7) /* dropped due to out of mem */ + +#define FC_SRB_READ (1 << 1) +#define FC_SRB_WRITE (1 << 0) + +/* + * The SCp.ptr should be tested and set under the host lock. NULL indicates + * that the command has been returned to the scsi layer. + */ +#define CMD_SP(Cmnd) ((struct fc_fcp_pkt *)(Cmnd)->SCp.ptr) +#define CMD_ENTRY_STATUS(Cmnd) ((Cmnd)->SCp.have_data_in) +#define CMD_COMPL_STATUS(Cmnd) ((Cmnd)->SCp.this_residual) +#define CMD_SCSI_STATUS(Cmnd) ((Cmnd)->SCp.Status) +#define CMD_RESID_LEN(Cmnd) ((Cmnd)->SCp.buffers_residual) + +struct fc_fcp_internal { + mempool_t *scsi_pkt_pool; + struct list_head scsi_pkt_queue; + u8 throttled; +}; + +#define fc_get_scsi_internal(x) ((struct fc_fcp_internal *)(x)->scsi_priv) + +/* + * function prototypes + * FC scsi I/O related functions + */ +static void fc_fcp_recv_data(struct fc_fcp_pkt *, struct fc_frame *); +static void fc_fcp_recv(struct fc_seq *, struct fc_frame *, void *); +static void fc_fcp_resp(struct fc_fcp_pkt *, struct fc_frame *); +static void fc_fcp_complete_locked(struct fc_fcp_pkt *); +static void fc_tm_done(struct fc_seq *, struct fc_frame *, void *); +static void fc_fcp_error(struct fc_fcp_pkt *fsp, struct fc_frame *fp); +static void fc_timeout_error(struct fc_fcp_pkt *); +static void fc_fcp_timeout(unsigned long data); +static void fc_fcp_rec(struct
fc_fcp_pkt *); +static void fc_fcp_rec_error(struct fc_fcp_pkt *, struct fc_frame *); +static void fc_fcp_rec_resp(struct fc_seq *, struct fc_frame *, void *); +static void fc_io_compl(struct fc_fcp_pkt *); + +static void fc_fcp_srr(struct fc_fcp_pkt *, enum fc_rctl, u32); +static void fc_fcp_srr_resp(struct fc_seq *, struct fc_frame *, void *); +static void fc_fcp_srr_error(struct fc_fcp_pkt *, struct fc_frame *); + +/* + * command status codes + */ +#define FC_COMPLETE 0 +#define FC_CMD_ABORTED 1 +#define FC_CMD_RESET 2 +#define FC_CMD_PLOGO 3 +#define FC_SNS_RCV 4 +#define FC_TRANS_ERR 5 +#define FC_DATA_OVRRUN 6 +#define FC_DATA_UNDRUN 7 +#define FC_ERROR 8 +#define FC_HRD_ERROR 9 +#define FC_CMD_TIME_OUT 10 + +/* + * Error recovery timeout values. + */ +#define FC_SCSI_ER_TIMEOUT (10 * HZ) +#define FC_SCSI_TM_TOV (10 * HZ) +#define FC_SCSI_REC_TOV (2 * HZ) +#define FC_HOST_RESET_TIMEOUT (30 * HZ) + +#define FC_MAX_ERROR_CNT 5 +#define FC_MAX_RECOV_RETRY 3 + +#define FC_FCP_DFLT_QUEUE_DEPTH 32 + +/** + * fc_fcp_pkt_alloc - allocation routine for scsi_pkt packet + * @lp: fc lport struct + * @gfp: gfp flags for allocation + * + * This is used by upper layer scsi driver. + * Return Value : scsi_pkt structure or null on allocation failure. + * Context : call from process context. no locking required. + */ +static struct fc_fcp_pkt *fc_fcp_pkt_alloc(struct fc_lport *lp, gfp_t gfp) +{ + struct fc_fcp_internal *si = fc_get_scsi_internal(lp); + struct fc_fcp_pkt *fsp; + + fsp = mempool_alloc(si->scsi_pkt_pool, gfp); + if (fsp) { + memset(fsp, 0, sizeof(*fsp)); + fsp->lp = lp; + atomic_set(&fsp->ref_cnt, 1); + init_timer(&fsp->timer); + INIT_LIST_HEAD(&fsp->list); + spin_lock_init(&fsp->scsi_pkt_lock); + } + return fsp; +} + +/** + * fc_fcp_pkt_release - release hold on scsi_pkt packet + * @fsp: fcp packet struct + * + * This is used by upper layer scsi driver. + * Context : call from process and interrupt context. 
+ * no locking required + */ +static void fc_fcp_pkt_release(struct fc_fcp_pkt *fsp) +{ + if (atomic_dec_and_test(&fsp->ref_cnt)) { + struct fc_fcp_internal *si = fc_get_scsi_internal(fsp->lp); + + mempool_free(fsp, si->scsi_pkt_pool); + } +} + +static void fc_fcp_pkt_hold(struct fc_fcp_pkt *fsp) +{ + atomic_inc(&fsp->ref_cnt); +} + +/** + * fc_fcp_pkt_destroy - release hold on scsi_pkt packet + * + * @seq: exchange sequence + * @fsp: fcp packet struct + * + * Release the hold on scsi_pkt packet that was taken to keep + * scsi_pkt until the EM layer exch resource is freed. + * Context : called from the EM layer. + * no locking required + */ +static void fc_fcp_pkt_destroy(struct fc_seq *seq, void *fsp) +{ + fc_fcp_pkt_release(fsp); +} + +/** + * fc_fcp_lock_pkt - lock a packet and get a ref to it. + * @fsp: fcp packet + * + * We should only return error if we return a command to scsi-ml before + * getting a response. This can happen in cases where we send an abort, but + * do not wait for the response and the abort and command can be passing + * each other on the wire/network-layer. + * + * Note: this function locks the packet and gets a reference to allow + * callers to call the completion function while the lock is held and + * not have to worry about the packet's refcount. + * + * TODO: Maybe we should just have callers grab/release the lock and + * have a function that they call to verify the fsp and grab a ref if + * needed.
+ */ +static inline int fc_fcp_lock_pkt(struct fc_fcp_pkt *fsp) +{ + spin_lock_bh(&fsp->scsi_pkt_lock); + if (fsp->state & FC_SRB_COMPL) { + spin_unlock_bh(&fsp->scsi_pkt_lock); + return -EPERM; + } + + fc_fcp_pkt_hold(fsp); + return 0; +} + +static inline void fc_fcp_unlock_pkt(struct fc_fcp_pkt *fsp) +{ + spin_unlock_bh(&fsp->scsi_pkt_lock); + fc_fcp_pkt_release(fsp); +} + +static void fc_fcp_timer_set(struct fc_fcp_pkt *fsp, unsigned long delay) +{ + if (!(fsp->state & FC_SRB_COMPL)) + mod_timer(&fsp->timer, jiffies + delay); +} + +static int fc_fcp_send_abort(struct fc_fcp_pkt *fsp) +{ + if (!fsp->seq_ptr) + return -EINVAL; + + fsp->state |= FC_SRB_ABORT_PENDING; + return fsp->lp->tt.seq_exch_abort(fsp->seq_ptr, 0); +} + +/* + * Retry command. + * An abort isn't needed. + */ +static void fc_fcp_retry_cmd(struct fc_fcp_pkt *fsp) +{ + if (fsp->seq_ptr) { + fsp->lp->tt.exch_done(fsp->seq_ptr); + fsp->seq_ptr = NULL; + } + + fsp->state &= ~FC_SRB_ABORT_PENDING; + fsp->io_status = SUGGEST_RETRY << 24; + fsp->status_code = FC_ERROR; + fc_fcp_complete_locked(fsp); +} + +/* + * Receive SCSI data from target. + * Called after receiving solicited data. + */ +static void fc_fcp_recv_data(struct fc_fcp_pkt *fsp, struct fc_frame *fp) +{ + struct scsi_cmnd *sc = fsp->cmd; + struct fc_lport *lp = fsp->lp; + struct fcoe_dev_stats *stats; + struct fc_frame_header *fh; + size_t start_offset; + size_t offset; + u32 crc; + u32 copy_len = 0; + size_t len; + void *buf; + struct scatterlist *sg; + size_t remaining; + + fh = fc_frame_header_get(fp); + offset = ntohl(fh->fh_parm_offset); + start_offset = offset; + len = fr_len(fp) - sizeof(*fh); + buf = fc_frame_payload_get(fp, 0); + + if (offset + len > fsp->data_len) { + /* + * this should never happen + */ + if ((fr_flags(fp) & FCPHF_CRC_UNCHECKED) && + fc_frame_crc_check(fp)) + goto crc_err; + FC_DEBUG_FCP("data received past end. 
len %zx offset %zx " + "data_len %x\n", len, offset, fsp->data_len); + fc_fcp_retry_cmd(fsp); + return; + } + if (offset != fsp->xfer_len) + fsp->state |= FC_SRB_DISCONTIG; + + crc = 0; + if (fr_flags(fp) & FCPHF_CRC_UNCHECKED) + crc = crc32(~0, (u8 *) fh, sizeof(*fh)); + + sg = scsi_sglist(sc); + remaining = len; + + while (remaining > 0 && sg) { + size_t off; + void *page_addr; + size_t sg_bytes; + + if (offset >= sg->length) { + offset -= sg->length; + sg = sg_next(sg); + continue; + } + sg_bytes = min(remaining, sg->length - offset); + + /* + * The scatterlist item may be bigger than PAGE_SIZE, + * but we are limited to mapping PAGE_SIZE at a time. + */ + off = offset + sg->offset; + sg_bytes = min(sg_bytes, (size_t) + (PAGE_SIZE - (off & ~PAGE_MASK))); + page_addr = kmap_atomic(sg_page(sg) + (off >> PAGE_SHIFT), + KM_SOFTIRQ0); + if (!page_addr) + break; /* XXX panic? */ + + if (fr_flags(fp) & FCPHF_CRC_UNCHECKED) + crc = crc32(crc, buf, sg_bytes); + memcpy((char *)page_addr + (off & ~PAGE_MASK), buf, + sg_bytes); + + kunmap_atomic(page_addr, KM_SOFTIRQ0); + buf += sg_bytes; + offset += sg_bytes; + remaining -= sg_bytes; + copy_len += sg_bytes; + } + + if (fr_flags(fp) & FCPHF_CRC_UNCHECKED) { + buf = fc_frame_payload_get(fp, 0); + if (len % 4) { + crc = crc32(crc, buf + len, 4 - (len % 4)); + len += 4 - (len % 4); + } + + if (~crc != le32_to_cpu(fr_crc(fp))) { +crc_err: + stats = lp->dev_stats[smp_processor_id()]; + stats->ErrorFrames++; + if (stats->InvalidCRCCount++ < 5) + FC_DBG("CRC error on data frame\n"); + /* + * Assume the frame is total garbage. + * We may have copied it over the good part + * of the buffer. + * If so, we need to retry the entire operation. + * Otherwise, ignore it. 
+ */ + if (fsp->state & FC_SRB_DISCONTIG) + fc_fcp_retry_cmd(fsp); + return; + } + } + + if (fsp->xfer_contig_end == start_offset) + fsp->xfer_contig_end += copy_len; + fsp->xfer_len += copy_len; + + /* + * In the very rare event that this data arrived after the response + * and completes the transfer, call the completion handler. + */ + if (unlikely(fsp->state & FC_SRB_RCV_STATUS) && + fsp->xfer_len == fsp->data_len - fsp->scsi_resid) + fc_fcp_complete_locked(fsp); +} + +/* + * fc_fcp_send_data - Send SCSI data to target. + * @fsp: ptr to fc_fcp_pkt + * @sp: ptr to this sequence + * @offset: starting offset for this data request + * @seq_blen: the burst length for this data request + * + * Called after receiving a Transfer Ready data descriptor. + * if LLD is capable of seq offload then send down seq_blen + * size of data in single frame, otherwise send multiple FC + * frames of max FC frame payload supported by target port. + * + * Returns : 0 for success. + */ +static int fc_fcp_send_data(struct fc_fcp_pkt *fsp, struct fc_seq *seq, + size_t offset, size_t seq_blen) +{ + struct fc_exch *ep; + struct scsi_cmnd *sc; + struct scatterlist *sg; + struct fc_frame *fp = NULL; + struct fc_lport *lp = fsp->lp; + size_t remaining; + size_t t_blen; + size_t tlen; + size_t sg_bytes; + size_t frame_offset, fh_parm_offset; + int error; + void *data = NULL; + void *page_addr; + int using_sg = lp->sg_supp; + u32 f_ctl; + + WARN_ON(seq_blen <= 0); + if (unlikely(offset + seq_blen > fsp->data_len)) { + /* this should never happen */ + FC_DEBUG_FCP("xfer-ready past end. seq_blen %zx offset %zx\n", + seq_blen, offset); + fc_fcp_send_abort(fsp); + return 0; + } else if (offset != fsp->xfer_len) { + /* Out of Order Data Request - no problem, but unexpected. */ + FC_DEBUG_FCP("xfer-ready non-contiguous. 
" + "seq_blen %zx offset %zx\n", seq_blen, offset); + } + + /* + * if LLD is capable of seq_offload then set transport + * burst length (t_blen) to seq_blen, otherwise set t_blen + * to max FC frame payload previously set in fsp->max_payload. + */ + t_blen = lp->seq_offload ? seq_blen : fsp->max_payload; + WARN_ON(t_blen < FC_MIN_MAX_PAYLOAD); + if (t_blen > 512) + t_blen &= ~(512 - 1); /* round down to block size */ + WARN_ON(t_blen < FC_MIN_MAX_PAYLOAD); /* won't go below 256 */ + sc = fsp->cmd; + + remaining = seq_blen; + fh_parm_offset = frame_offset = offset; + tlen = 0; + seq = lp->tt.seq_start_next(seq); + f_ctl = FC_FC_REL_OFF; + WARN_ON(!seq); + + /* + * If a get_page()/put_page() will fail, don't use sg lists + * in the fc_frame structure. + * + * The put_page() may be long after the I/O has completed + * in the case of FCoE, since the network driver does it + * via free_skb(). See the test in free_pages_check(). + * + * Test this case with 'dd /dev/st0 bs=64k'. + */ + if (using_sg) { + for (sg = scsi_sglist(sc); sg; sg = sg_next(sg)) { + if (page_count(sg_page(sg)) == 0 || + (sg_page(sg)->flags & (1 << PG_lru | + 1 << PG_private | + 1 << PG_locked | + 1 << PG_active | + 1 << PG_slab | + 1 << PG_swapcache | + 1 << PG_writeback | + 1 << PG_reserved | + 1 << PG_buddy))) { + using_sg = 0; + break; + } + } + } + sg = scsi_sglist(sc); + + while (remaining > 0 && sg) { + if (offset >= sg->length) { + offset -= sg->length; + sg = sg_next(sg); + continue; + } + if (!fp) { + tlen = min(t_blen, remaining); + + /* + * TODO. Temporary workaround. fc_seq_send() can't + * handle odd lengths in non-linear skbs. + * This will be the final fragment only. 
+ */ + if (tlen % 4) + using_sg = 0; + if (using_sg) { + fp = _fc_frame_alloc(lp, 0); + if (!fp) + return -ENOMEM; + } else { + fp = fc_frame_alloc(lp, tlen); + if (!fp) + return -ENOMEM; + + data = (void *)(fr_hdr(fp)) + + sizeof(struct fc_frame_header); + } + fh_parm_offset = frame_offset; + fr_max_payload(fp) = fsp->max_payload; + } + sg_bytes = min(tlen, sg->length - offset); + if (using_sg) { + WARN_ON(skb_shinfo(fp_skb(fp))->nr_frags > + FC_FRAME_SG_LEN); + get_page(sg_page(sg)); + skb_fill_page_desc(fp_skb(fp), + skb_shinfo(fp_skb(fp))->nr_frags, + sg_page(sg), sg->offset + offset, + sg_bytes); + fp_skb(fp)->data_len += sg_bytes; + fr_len(fp) += sg_bytes; + fp_skb(fp)->truesize += PAGE_SIZE; + } else { + size_t off = offset + sg->offset; + + /* + * The scatterlist item may be bigger than PAGE_SIZE, + * but we must not cross pages inside the kmap. + */ + sg_bytes = min(sg_bytes, (size_t) (PAGE_SIZE - + (off & ~PAGE_MASK))); + page_addr = kmap_atomic(sg_page(sg) + + (off >> PAGE_SHIFT), + KM_SOFTIRQ0); + memcpy(data, (char *)page_addr + (off & ~PAGE_MASK), + sg_bytes); + kunmap_atomic(page_addr, KM_SOFTIRQ0); + data += sg_bytes; + } + offset += sg_bytes; + frame_offset += sg_bytes; + tlen -= sg_bytes; + remaining -= sg_bytes; + + if (tlen) + continue; + + /* + * Send sequence with transfer sequence initiative in case + * this is last FCP frame of the sequence. + */ + if (remaining == 0) + f_ctl |= FC_FC_SEQ_INIT | FC_FC_END_SEQ; + + ep = fc_seq_exch(seq); + fc_fill_fc_hdr(fp, FC_RCTL_DD_SOL_DATA, ep->did, ep->sid, + FC_TYPE_FCP, f_ctl, fh_parm_offset); + + /* + * send fragment using for a sequence. + */ + error = lp->tt.seq_send(lp, seq, fp); + if (error) { + WARN_ON(1); /* send error should be rare */ + fc_fcp_retry_cmd(fsp); + return 0; + } + fp = NULL; + } + fsp->xfer_len += seq_blen; /* premature count? 
*/ + return 0; +} + +static void fc_fcp_abts_resp(struct fc_fcp_pkt *fsp, struct fc_frame *fp) +{ + int ba_done = 1; + struct fc_ba_rjt *brp; + struct fc_frame_header *fh; + + fh = fc_frame_header_get(fp); + switch (fh->fh_r_ctl) { + case FC_RCTL_BA_ACC: + break; + case FC_RCTL_BA_RJT: + brp = fc_frame_payload_get(fp, sizeof(*brp)); + if (brp && brp->br_reason == FC_BA_RJT_LOG_ERR) + break; + /* fall thru */ + default: + /* + * we will let the command timeout + * and scsi-ml recover in this case, + * therefore cleared the ba_done flag. + */ + ba_done = 0; + } + + if (ba_done) { + fsp->state |= FC_SRB_ABORTED; + fsp->state &= ~FC_SRB_ABORT_PENDING; + + if (fsp->wait_for_comp) + complete(&fsp->tm_done); + else + fc_fcp_complete_locked(fsp); + } +} + +/* + * fc_fcp_reduce_can_queue - drop can_queue + * @lp: lport to drop queueing for + * + * If we are getting memory allocation failures, then we may + * be trying to execute too many commands. We let the running + * commands complete or timeout, then try again with a reduced + * can_queue. Eventually we will hit the point where we run + * on all reserved structs. + */ +static void fc_fcp_reduce_can_queue(struct fc_lport *lp) +{ + struct fc_fcp_internal *si = fc_get_scsi_internal(lp); + unsigned long flags; + int can_queue; + + spin_lock_irqsave(lp->host->host_lock, flags); + if (si->throttled) + goto done; + si->throttled = 1; + + can_queue = lp->host->can_queue; + can_queue >>= 1; + if (!can_queue) + can_queue = 1; + lp->host->can_queue = can_queue; + shost_printk(KERN_ERR, lp->host, "Could not allocate frame.\n" + "Reducing can_queue to %d.\n", can_queue); +done: + spin_unlock_irqrestore(lp->host->host_lock, flags); +} + +/* + * exch mgr calls this routine to process scsi + * exchanges. 
+ * + * Return : None + * Context : called from Soft IRQ context + * cannot be called while holding the list lock + */ +static void fc_fcp_recv(struct fc_seq *seq, struct fc_frame *fp, void *arg) +{ + struct fc_fcp_pkt *fsp = (struct fc_fcp_pkt *)arg; + struct fc_lport *lp; + struct fc_frame_header *fh; + struct fcp_txrdy *dd; + u8 r_ctl; + int rc = 0; + + if (IS_ERR(fp)) + goto errout; + + fh = fc_frame_header_get(fp); + r_ctl = fh->fh_r_ctl; + lp = fsp->lp; + + if (!(lp->state & LPORT_ST_READY)) + goto out; + if (fc_fcp_lock_pkt(fsp)) + goto out; + fsp->last_pkt_time = jiffies; + + if (fh->fh_type == FC_TYPE_BLS) { + fc_fcp_abts_resp(fsp, fp); + goto unlock; + } + + if (fsp->state & (FC_SRB_ABORTED | FC_SRB_ABORT_PENDING)) + goto unlock; + + if (r_ctl == FC_RCTL_DD_DATA_DESC) { + /* + * received XFER RDY from the target + * need to send data to the target + */ + WARN_ON(fr_flags(fp) & FCPHF_CRC_UNCHECKED); + dd = fc_frame_payload_get(fp, sizeof(*dd)); + WARN_ON(!dd); + + rc = fc_fcp_send_data(fsp, seq, + (size_t) ntohl(dd->ft_data_ro), + (size_t) ntohl(dd->ft_burst_len)); + if (!rc) + seq->rec_data = fsp->xfer_len; + else if (rc == -ENOMEM) + fsp->state |= FC_SRB_NOMEM; + } else if (r_ctl == FC_RCTL_DD_SOL_DATA) { + /* + * received a DATA frame + * next we will copy the data to the system buffer + */ + WARN_ON(fr_len(fp) < sizeof(*fh)); /* len may be 0 */ + fc_fcp_recv_data(fsp, fp); + seq->rec_data = fsp->xfer_contig_end; + } else if (r_ctl == FC_RCTL_DD_CMD_STATUS) { + WARN_ON(fr_flags(fp) & FCPHF_CRC_UNCHECKED); + + fc_fcp_resp(fsp, fp); + } else { + FC_DBG("unexpected frame.
r_ctl %x\n", r_ctl); + } +unlock: + fc_fcp_unlock_pkt(fsp); +out: + fc_frame_free(fp); +errout: + if (IS_ERR(fp)) + fc_fcp_error(fsp, fp); + else if (rc == -ENOMEM) + fc_fcp_reduce_can_queue(lp); +} + +static void fc_fcp_resp(struct fc_fcp_pkt *fsp, struct fc_frame *fp) +{ + struct fc_frame_header *fh; + struct fcp_resp *fc_rp; + struct fcp_resp_ext *rp_ex; + struct fcp_resp_rsp_info *fc_rp_info; + u32 plen; + u32 expected_len; + u32 respl = 0; + u32 snsl = 0; + u8 flags = 0; + + plen = fr_len(fp); + fh = (struct fc_frame_header *)fr_hdr(fp); + if (unlikely(plen < sizeof(*fh) + sizeof(*fc_rp))) + goto len_err; + plen -= sizeof(*fh); + fc_rp = (struct fcp_resp *)(fh + 1); + fsp->cdb_status = fc_rp->fr_status; + flags = fc_rp->fr_flags; + fsp->scsi_comp_flags = flags; + expected_len = fsp->data_len; + + if (unlikely((flags & ~FCP_CONF_REQ) || fc_rp->fr_status)) { + rp_ex = (void *)(fc_rp + 1); + if (flags & (FCP_RSP_LEN_VAL | FCP_SNS_LEN_VAL)) { + if (plen < sizeof(*fc_rp) + sizeof(*rp_ex)) + goto len_err; + fc_rp_info = (struct fcp_resp_rsp_info *)(rp_ex + 1); + if (flags & FCP_RSP_LEN_VAL) { + respl = ntohl(rp_ex->fr_rsp_len); + if (respl != sizeof(*fc_rp_info)) + goto len_err; + if (fsp->wait_for_comp) { + /* Abuse cdb_status for rsp code */ + fsp->cdb_status = fc_rp_info->rsp_code; + complete(&fsp->tm_done); + /* + * tmfs will not have any scsi cmd so + * exit here + */ + return; + } else + goto err; + } + if (flags & FCP_SNS_LEN_VAL) { + snsl = ntohl(rp_ex->fr_sns_len); + if (snsl > SCSI_SENSE_BUFFERSIZE) + snsl = SCSI_SENSE_BUFFERSIZE; + memcpy(fsp->cmd->sense_buffer, + (char *)fc_rp_info + respl, snsl); + } + } + if (flags & (FCP_RESID_UNDER | FCP_RESID_OVER)) { + if (plen < sizeof(*fc_rp) + sizeof(rp_ex->fr_resid)) + goto len_err; + if (flags & FCP_RESID_UNDER) { + fsp->scsi_resid = ntohl(rp_ex->fr_resid); + /* + * The cmnd->underflow is the minimum number of + * bytes that must be transfered for this + * command. 
Provided a sense condition is not + * present, make sure the actual amount + * transferred is at least the underflow value + * or fail. + */ + if (!(flags & FCP_SNS_LEN_VAL) && + (fc_rp->fr_status == 0) && + (scsi_bufflen(fsp->cmd) - + fsp->scsi_resid) < fsp->cmd->underflow) + goto err; + expected_len -= fsp->scsi_resid; + } else { + fsp->status_code = FC_ERROR; + } + } + } + fsp->state |= FC_SRB_RCV_STATUS; + + /* + * Check for missing or extra data frames. + */ + if (unlikely(fsp->xfer_len != expected_len)) { + if (fsp->xfer_len < expected_len) { + /* + * Some data may be queued locally. + * Wait at least one jiffy to see if it is delivered. + * If this expires without data, we may do SRR. + */ + fc_fcp_timer_set(fsp, 2); + return; + } + fsp->status_code = FC_DATA_OVRRUN; + FC_DBG("tgt %6x xfer len %zx greater than expected len %x. " + "data len %x\n", + fsp->rport->port_id, + fsp->xfer_len, expected_len, fsp->data_len); + } + fc_fcp_complete_locked(fsp); + return; + +len_err: + FC_DBG("short FCP response. flags 0x%x len %u respl %u snsl %u\n", + flags, fr_len(fp), respl, snsl); +err: + fsp->status_code = FC_ERROR; + fc_fcp_complete_locked(fsp); +} + +/** + * fc_fcp_complete_locked - complete processing of a fcp packet + * @fsp: fcp packet + * + * This function may sleep if a timer is pending. The packet lock must be + * held, and the host lock must not be held. + */ +static void fc_fcp_complete_locked(struct fc_fcp_pkt *fsp) +{ + struct fc_lport *lp = fsp->lp; + struct fc_seq *seq; + struct fc_exch *ep; + u32 f_ctl; + + if (fsp->state & FC_SRB_ABORT_PENDING) + return; + + if (fsp->state & FC_SRB_ABORTED) { + if (!fsp->status_code) + fsp->status_code = FC_CMD_ABORTED; + } else { + /* + * Test for transport underrun, independent of response + * underrun status. 
+ */ + if (fsp->xfer_len < fsp->data_len && !fsp->io_status && + (!(fsp->scsi_comp_flags & FCP_RESID_UNDER) || + fsp->xfer_len < fsp->data_len - fsp->scsi_resid)) { + fsp->status_code = FC_DATA_UNDRUN; + fsp->io_status = SUGGEST_RETRY << 24; + } + } + + seq = fsp->seq_ptr; + if (seq) { + fsp->seq_ptr = NULL; + if (unlikely(fsp->scsi_comp_flags & FCP_CONF_REQ)) { + struct fc_frame *conf_frame; + struct fc_seq *csp; + + csp = lp->tt.seq_start_next(seq); + conf_frame = fc_frame_alloc(fsp->lp, 0); + if (conf_frame) { + f_ctl = FC_FC_SEQ_INIT; + f_ctl |= FC_FC_LAST_SEQ | FC_FC_END_SEQ; + ep = fc_seq_exch(seq); + fc_fill_fc_hdr(conf_frame, FC_RCTL_DD_SOL_CTL, + ep->did, ep->sid, + FC_TYPE_FCP, f_ctl, 0); + lp->tt.seq_send(lp, csp, conf_frame); + } + } + lp->tt.exch_done(seq); + } + fc_io_compl(fsp); +} + +static void fc_fcp_cleanup_cmd(struct fc_fcp_pkt *fsp, int error) +{ + struct fc_lport *lp = fsp->lp; + + if (fsp->seq_ptr) { + lp->tt.exch_done(fsp->seq_ptr); + fsp->seq_ptr = NULL; + } + fsp->status_code = error; +} + +/** + * fc_fcp_cleanup_each_cmd - run fn on each active command + * @lp: logical port + * @id: target id + * @lun: lun + * @error: fsp status code + * + * If lun or id is -1, they are ignored. 
+ */ +static void fc_fcp_cleanup_each_cmd(struct fc_lport *lp, unsigned int id, + unsigned int lun, int error) +{ + struct fc_fcp_internal *si = fc_get_scsi_internal(lp); + struct fc_fcp_pkt *fsp; + struct scsi_cmnd *sc_cmd; + unsigned long flags; + + spin_lock_irqsave(lp->host->host_lock, flags); +restart: + list_for_each_entry(fsp, &si->scsi_pkt_queue, list) { + sc_cmd = fsp->cmd; + if (id != -1 && scmd_id(sc_cmd) != id) + continue; + + if (lun != -1 && sc_cmd->device->lun != lun) + continue; + + fc_fcp_pkt_hold(fsp); + spin_unlock_irqrestore(lp->host->host_lock, flags); + + if (!fc_fcp_lock_pkt(fsp)) { + fc_fcp_cleanup_cmd(fsp, error); + fc_io_compl(fsp); + fc_fcp_unlock_pkt(fsp); + } + + fc_fcp_pkt_release(fsp); + spin_lock_irqsave(lp->host->host_lock, flags); + /* + * while we dropped the lock multiple pkts could + * have been released, so we have to start over. + */ + goto restart; + } + spin_unlock_irqrestore(lp->host->host_lock, flags); +} + +static void fc_fcp_abort_io(struct fc_lport *lp) +{ + fc_fcp_cleanup_each_cmd(lp, -1, -1, FC_HRD_ERROR); +} + +/** + * fc_fcp_pkt_send - send a fcp packet to the lower level. + * @lp: fc lport + * @fsp: fc packet. + * + * This is called by upper layer protocol. + * Return : zero for success and -1 for failure + * Context : called from queuecommand which can be called from process + * or scsi soft irq. + * Locks : called with the host lock and irqs disabled. 
+ */ +static int fc_fcp_pkt_send(struct fc_lport *lp, struct fc_fcp_pkt *fsp) +{ + struct fc_fcp_internal *si = fc_get_scsi_internal(lp); + int rc; + + fsp->cmd->SCp.ptr = (char *)fsp; + fsp->cdb_cmd.fc_dl = htonl(fsp->data_len); + fsp->cdb_cmd.fc_flags = fsp->req_flags & ~FCP_CFL_LEN_MASK; + + int_to_scsilun(fsp->cmd->device->lun, + (struct scsi_lun *)fsp->cdb_cmd.fc_lun); + memcpy(fsp->cdb_cmd.fc_cdb, fsp->cmd->cmnd, fsp->cmd->cmd_len); + list_add_tail(&fsp->list, &si->scsi_pkt_queue); + + spin_unlock_irq(lp->host->host_lock); + rc = lp->tt.fcp_cmd_send(lp, fsp, fc_fcp_recv); + spin_lock_irq(lp->host->host_lock); + if (rc) + list_del(&fsp->list); + + return rc; +} + +static int fc_fcp_cmd_send(struct fc_lport *lp, struct fc_fcp_pkt *fsp, + void (*resp)(struct fc_seq *, + struct fc_frame *fp, + void *arg)) +{ + struct fc_frame *fp; + struct fc_seq *seq; + struct fc_rport *rport; + struct fc_rport_libfc_priv *rp; + const size_t len = sizeof(fsp->cdb_cmd); + int rc = 0; + + if (fc_fcp_lock_pkt(fsp)) + return 0; + + fp = fc_frame_alloc(lp, sizeof(fsp->cdb_cmd)); + if (!fp) { + rc = -1; + goto unlock; + } + + memcpy(fc_frame_payload_get(fp, len), &fsp->cdb_cmd, len); + fr_cmd(fp) = fsp->cmd; + rport = fsp->rport; + fsp->max_payload = rport->maxframe_size; + rp = rport->dd_data; + + fc_fill_fc_hdr(fp, FC_RCTL_DD_UNSOL_CMD, rport->port_id, + fc_host_port_id(rp->local_port->host), FC_TYPE_FCP, + FC_FC_FIRST_SEQ | FC_FC_END_SEQ | FC_FC_SEQ_INIT, 0); + + seq = lp->tt.exch_seq_send(lp, fp, resp, fc_fcp_pkt_destroy, fsp, 0); + if (!seq) { + fc_frame_free(fp); + rc = -1; + goto unlock; + } + fsp->last_pkt_time = jiffies; + fsp->seq_ptr = seq; + fc_fcp_pkt_hold(fsp); /* hold for fc_fcp_pkt_destroy */ + + setup_timer(&fsp->timer, fc_fcp_timeout, (unsigned long)fsp); + fc_fcp_timer_set(fsp, + (fsp->tgt_flags & FC_RP_FLAGS_REC_SUPPORTED) ? 
+ FC_SCSI_REC_TOV : FC_SCSI_ER_TIMEOUT); +unlock: + fc_fcp_unlock_pkt(fsp); + return rc; +} + +/* + * transport error handler + */ +static void fc_fcp_error(struct fc_fcp_pkt *fsp, struct fc_frame *fp) +{ + int error = PTR_ERR(fp); + + if (fc_fcp_lock_pkt(fsp)) + return; + + switch (error) { + case -FC_EX_CLOSED: + fc_fcp_retry_cmd(fsp); + goto unlock; + default: + FC_DBG("unknown error %ld\n", PTR_ERR(fp)); + } + /* + * clear abort pending, because the lower layer + * decided to force completion. + */ + fsp->state &= ~FC_SRB_ABORT_PENDING; + fsp->status_code = FC_CMD_PLOGO; + fc_fcp_complete_locked(fsp); +unlock: + fc_fcp_unlock_pkt(fsp); +} + +/* + * Scsi abort handler- calls to send an abort + * and then wait for abort completion + */ +static int fc_fcp_pkt_abort(struct fc_lport *lp, struct fc_fcp_pkt *fsp) +{ + int rc = FAILED; + + if (fc_fcp_send_abort(fsp)) + return FAILED; + + init_completion(&fsp->tm_done); + fsp->wait_for_comp = 1; + + spin_unlock_bh(&fsp->scsi_pkt_lock); + rc = wait_for_completion_timeout(&fsp->tm_done, FC_SCSI_TM_TOV); + spin_lock_bh(&fsp->scsi_pkt_lock); + fsp->wait_for_comp = 0; + + if (!rc) { + FC_DBG("target abort cmd failed\n"); + rc = FAILED; + } else if (fsp->state & FC_SRB_ABORTED) { + FC_DBG("target abort cmd passed\n"); + rc = SUCCESS; + fc_fcp_complete_locked(fsp); + } + + return rc; +} + +/* + * Retry LUN reset after resource allocation failed. 
+ */ +static void fc_lun_reset_send(unsigned long data) +{ + struct fc_fcp_pkt *fsp = (struct fc_fcp_pkt *)data; + struct fc_lport *lp = fsp->lp; + if (lp->tt.fcp_cmd_send(lp, fsp, fc_tm_done)) { + if (fsp->recov_retry++ >= FC_MAX_RECOV_RETRY) + return; + if (fc_fcp_lock_pkt(fsp)) + return; + setup_timer(&fsp->timer, fc_lun_reset_send, (unsigned long)fsp); + fc_fcp_timer_set(fsp, FC_SCSI_REC_TOV); + fc_fcp_unlock_pkt(fsp); + } +} + +/* + * Scsi device reset handler - send a LUN RESET to the device + * and wait for reset reply + */ +static int fc_lun_reset(struct fc_lport *lp, struct fc_fcp_pkt *fsp, + unsigned int id, unsigned int lun) +{ + int rc; + + fsp->cdb_cmd.fc_dl = htonl(fsp->data_len); + fsp->cdb_cmd.fc_tm_flags = FCP_TMF_LUN_RESET; + int_to_scsilun(lun, (struct scsi_lun *)fsp->cdb_cmd.fc_lun); + + fsp->wait_for_comp = 1; + init_completion(&fsp->tm_done); + + fc_lun_reset_send((unsigned long)fsp); + + /* + * wait for completion of reset + * after that make sure all commands are terminated + */ + rc = wait_for_completion_timeout(&fsp->tm_done, FC_SCSI_TM_TOV); + + spin_lock_bh(&fsp->scsi_pkt_lock); + fsp->state |= FC_SRB_COMPL; + spin_unlock_bh(&fsp->scsi_pkt_lock); + + del_timer_sync(&fsp->timer); + + spin_lock_bh(&fsp->scsi_pkt_lock); + if (fsp->seq_ptr) { + lp->tt.exch_done(fsp->seq_ptr); + fsp->seq_ptr = NULL; + } + fsp->wait_for_comp = 0; + spin_unlock_bh(&fsp->scsi_pkt_lock); + + if (!rc) { + FC_DBG("lun reset failed\n"); + return FAILED; + } + + /* cdb_status holds the tmf's rsp code */ + if (fsp->cdb_status != FCP_TMF_CMPL) + return FAILED; + + FC_DBG("lun reset to lun %u completed\n", lun); + fc_fcp_cleanup_each_cmd(lp, id, lun, FC_CMD_ABORTED); + return SUCCESS; +} + +/* + * Task Management response handler + */ +static void fc_tm_done(struct fc_seq *seq, struct fc_frame *fp, void *arg) +{ + struct fc_fcp_pkt *fsp = arg; + struct fc_frame_header *fh; + + if (IS_ERR(fp)) { + /* + * If there is an error just let it time out or wait + * for TMF to be 
aborted if it timed out. + * + * scsi-eh will escalate when either happens. + */ + return; + } + + if (fc_fcp_lock_pkt(fsp)) + return; + + /* + * raced with eh timeout handler. + */ + if (!fsp->seq_ptr || !fsp->wait_for_comp) { + spin_unlock_bh(&fsp->scsi_pkt_lock); + return; + } + + fh = fc_frame_header_get(fp); + if (fh->fh_type != FC_TYPE_BLS) + fc_fcp_resp(fsp, fp); + fsp->seq_ptr = NULL; + fsp->lp->tt.exch_done(seq); + fc_frame_free(fp); + fc_fcp_unlock_pkt(fsp); +} + +static void fc_fcp_cleanup(struct fc_lport *lp) +{ + fc_fcp_cleanup_each_cmd(lp, -1, -1, FC_ERROR); +} + +/* + * fc_fcp_timeout: called by OS timer function. + * + * The timer has been inactivated and must be reactivated if desired + * using fc_fcp_timer_set(). + * + * Algorithm: + * + * If REC is supported, just issue it, and return. The REC exchange will + * complete or time out, and recovery can continue at that point. + * + * Otherwise, if the response has been received without all the data, + * it has been ER_TIMEOUT since the response was received. + * + * If the response has not been received, + * we see if data was received recently. If it has been, we continue waiting, + * otherwise, we abort the command. 
+ */ +static void fc_fcp_timeout(unsigned long data) +{ + struct fc_fcp_pkt *fsp = (struct fc_fcp_pkt *)data; + struct fc_rport *rport = fsp->rport; + struct fc_rport_libfc_priv *rp = rport->dd_data; + + if (fc_fcp_lock_pkt(fsp)) + return; + + if (fsp->cdb_cmd.fc_tm_flags) + goto unlock; + + fsp->state |= FC_SRB_FCP_PROCESSING_TMO; + + if (rp->flags & FC_RP_FLAGS_REC_SUPPORTED) + fc_fcp_rec(fsp); + else if (time_after_eq(fsp->last_pkt_time + (FC_SCSI_ER_TIMEOUT / 2), + jiffies)) + fc_fcp_timer_set(fsp, FC_SCSI_ER_TIMEOUT); + else if (fsp->state & FC_SRB_RCV_STATUS) + fc_fcp_complete_locked(fsp); + else + fc_timeout_error(fsp); + fsp->state &= ~FC_SRB_FCP_PROCESSING_TMO; +unlock: + fc_fcp_unlock_pkt(fsp); +} + +/* + * Send a REC ELS request + */ +static void fc_fcp_rec(struct fc_fcp_pkt *fsp) +{ + struct fc_lport *lp; + struct fc_frame *fp; + struct fc_rport *rport; + struct fc_rport_libfc_priv *rp; + + lp = fsp->lp; + rport = fsp->rport; + rp = rport->dd_data; + if (!fsp->seq_ptr || rp->rp_state != RPORT_ST_READY) { + fsp->status_code = FC_HRD_ERROR; + fsp->io_status = SUGGEST_RETRY << 24; + fc_fcp_complete_locked(fsp); + return; + } + fp = fc_frame_alloc(lp, sizeof(struct fc_els_rec)); + if (!fp) + goto retry; + + fr_seq(fp) = fsp->seq_ptr; + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REQ, rport->port_id, + fc_host_port_id(rp->local_port->host), FC_TYPE_ELS, + FC_FC_FIRST_SEQ | FC_FC_END_SEQ | FC_FC_SEQ_INIT, 0); + if (lp->tt.elsct_send(lp, rport, fp, ELS_REC, fc_fcp_rec_resp, + fsp, jiffies_to_msecs(FC_SCSI_REC_TOV))) { + fc_fcp_pkt_hold(fsp); /* hold while REC outstanding */ + return; + } + fc_frame_free(fp); +retry: + if (fsp->recov_retry++ < FC_MAX_RECOV_RETRY) + fc_fcp_timer_set(fsp, FC_SCSI_REC_TOV); + else + fc_timeout_error(fsp); +} + +/* + * Receive handler for REC ELS frame + * if it is a reject then let the scsi layer to handle + * the timeout. 
if it is a LS_ACC then if the io was not completed + * then set the timeout and return otherwise complete the exchange + * and tell the scsi layer to restart the I/O. + */ +static void fc_fcp_rec_resp(struct fc_seq *seq, struct fc_frame *fp, void *arg) +{ + struct fc_fcp_pkt *fsp = (struct fc_fcp_pkt *)arg; + struct fc_els_rec_acc *recp; + struct fc_els_ls_rjt *rjt; + u32 e_stat; + u8 opcode; + u32 offset; + enum dma_data_direction data_dir; + enum fc_rctl r_ctl; + struct fc_rport_libfc_priv *rp; + + if (IS_ERR(fp)) { + fc_fcp_rec_error(fsp, fp); + return; + } + + if (fc_fcp_lock_pkt(fsp)) + goto out; + + fsp->recov_retry = 0; + opcode = fc_frame_payload_op(fp); + if (opcode == ELS_LS_RJT) { + rjt = fc_frame_payload_get(fp, sizeof(*rjt)); + switch (rjt->er_reason) { + default: + FC_DEBUG_FCP("device %x unexpected REC reject " + "reason %d expl %d\n", + fsp->rport->port_id, rjt->er_reason, + rjt->er_explan); + /* fall through */ + case ELS_RJT_UNSUP: + FC_DEBUG_FCP("device does not support REC\n"); + rp = fsp->rport->dd_data; + /* + * if we do not support RECs or got some bogus + * reason then set up the timer again so we check for + * making progress. + */ + rp->flags &= ~FC_RP_FLAGS_REC_SUPPORTED; + fc_fcp_timer_set(fsp, FC_SCSI_ER_TIMEOUT); + break; + case ELS_RJT_LOGIC: + case ELS_RJT_UNAB: + /* + * If no data transfer, the command frame got dropped + * so we just retry. If data was transferred, we + * lost the response but the target has no record, + * so we abort and retry. + */ + if (rjt->er_explan == ELS_EXPL_OXID_RXID && + fsp->xfer_len == 0) { + fc_fcp_retry_cmd(fsp); + break; + } + fc_timeout_error(fsp); + break; + } + } else if (opcode == ELS_LS_ACC) { + if (fsp->state & FC_SRB_ABORTED) + goto unlock_out; + + data_dir = fsp->cmd->sc_data_direction; + recp = fc_frame_payload_get(fp, sizeof(*recp)); + offset = ntohl(recp->reca_fc4value); + e_stat = ntohl(recp->reca_e_stat); + + if (e_stat & ESB_ST_COMPLETE) { + + /* + * The exchange is complete. 
+ * + * For output, we must've lost the response. + * For input, all data must've been sent. + * We may have lost the response + * (and a confirmation was requested) and maybe + * some data. + * + * If all data received, send SRR + * asking for response. If partial data received, + * or gaps, SRR requests data at start of gap. + * Recovery via SRR relies on in-order-delivery. + */ + if (data_dir == DMA_TO_DEVICE) { + r_ctl = FC_RCTL_DD_CMD_STATUS; + } else if (fsp->xfer_contig_end == offset) { + r_ctl = FC_RCTL_DD_CMD_STATUS; + } else { + offset = fsp->xfer_contig_end; + r_ctl = FC_RCTL_DD_SOL_DATA; + } + fc_fcp_srr(fsp, r_ctl, offset); + } else if (e_stat & ESB_ST_SEQ_INIT) { + + /* + * The remote port has the initiative, so just + * keep waiting for it to complete. + */ + fc_fcp_timer_set(fsp, FC_SCSI_REC_TOV); + } else { + + /* + * The exchange is incomplete, we have seq. initiative. + * Lost response with requested confirmation, + * lost confirmation, lost transfer ready or + * lost write data. + * + * For output, if not all data was received, ask + * for transfer ready to be repeated. + * + * If we received or sent all the data, send SRR to + * request response. + * + * If we lost a response, we may have lost some read + * data as well. + */ + r_ctl = FC_RCTL_DD_SOL_DATA; + if (data_dir == DMA_TO_DEVICE) { + r_ctl = FC_RCTL_DD_CMD_STATUS; + if (offset < fsp->data_len) + r_ctl = FC_RCTL_DD_DATA_DESC; + } else if (offset == fsp->xfer_contig_end) { + r_ctl = FC_RCTL_DD_CMD_STATUS; + } else if (fsp->xfer_contig_end < offset) { + offset = fsp->xfer_contig_end; + } + fc_fcp_srr(fsp, r_ctl, offset); + } + } +unlock_out: + fc_fcp_unlock_pkt(fsp); +out: + fc_fcp_pkt_release(fsp); /* drop hold for outstanding REC */ + fc_frame_free(fp); +} + +/* + * Handle error response or timeout for REC exchange. 
+ */ +static void fc_fcp_rec_error(struct fc_fcp_pkt *fsp, struct fc_frame *fp) +{ + int error = PTR_ERR(fp); + + if (fc_fcp_lock_pkt(fsp)) + goto out; + + switch (error) { + case -FC_EX_CLOSED: + fc_fcp_retry_cmd(fsp); + break; + + default: + FC_DBG("REC %p fid %x error unexpected error %d\n", + fsp, fsp->rport->port_id, error); + fsp->status_code = FC_CMD_PLOGO; + /* fall through */ + + case -FC_EX_TIMEOUT: + /* + * Assume REC or LS_ACC was lost. + * The exchange manager will have aborted REC, so retry. + */ + FC_DBG("REC fid %x error error %d retry %d/%d\n", + fsp->rport->port_id, error, fsp->recov_retry, + FC_MAX_RECOV_RETRY); + if (fsp->recov_retry++ < FC_MAX_RECOV_RETRY) + fc_fcp_rec(fsp); + else + fc_timeout_error(fsp); + break; + } + fc_fcp_unlock_pkt(fsp); +out: + fc_fcp_pkt_release(fsp); /* drop hold for outstanding REC */ +} + +/* + * Time out error routine: + * aborts the I/O, closes the exchange and + * sends completion notification to the scsi layer + */ +static void fc_timeout_error(struct fc_fcp_pkt *fsp) +{ + fsp->status_code = FC_CMD_TIME_OUT; + fsp->cdb_status = 0; + fsp->io_status = 0; + /* + * if this fails then we let the scsi command timer fire and + * scsi-ml escalate. + */ + fc_fcp_send_abort(fsp); +} + +/* + * Sequence retransmission request. + * This is called after receiving status but insufficient data, or + * when expecting status but the request has timed out. 
+ */ +static void fc_fcp_srr(struct fc_fcp_pkt *fsp, enum fc_rctl r_ctl, u32 offset) +{ + struct fc_lport *lp = fsp->lp; + struct fc_rport *rport; + struct fc_rport_libfc_priv *rp; + struct fc_exch *ep = fc_seq_exch(fsp->seq_ptr); + struct fc_seq *seq; + struct fcp_srr *srr; + struct fc_frame *fp; + u8 cdb_op; + + rport = fsp->rport; + rp = rport->dd_data; + cdb_op = fsp->cdb_cmd.fc_cdb[0]; + + if (!(rp->flags & FC_RP_FLAGS_RETRY) || rp->rp_state != RPORT_ST_READY) + goto retry; /* shouldn't happen */ + fp = fc_frame_alloc(lp, sizeof(*srr)); + if (!fp) + goto retry; + + srr = fc_frame_payload_get(fp, sizeof(*srr)); + memset(srr, 0, sizeof(*srr)); + srr->srr_op = ELS_SRR; + srr->srr_ox_id = htons(ep->oxid); + srr->srr_rx_id = htons(ep->rxid); + srr->srr_r_ctl = r_ctl; + srr->srr_rel_off = htonl(offset); + + fc_fill_fc_hdr(fp, FC_RCTL_ELS4_REQ, rport->port_id, + fc_host_port_id(rp->local_port->host), FC_TYPE_FCP, + FC_FC_FIRST_SEQ | FC_FC_END_SEQ | FC_FC_SEQ_INIT, 0); + + seq = lp->tt.exch_seq_send(lp, fp, fc_fcp_srr_resp, NULL, + fsp, jiffies_to_msecs(FC_SCSI_REC_TOV)); + if (!seq) { + fc_frame_free(fp); + goto retry; + } + fsp->recov_seq = seq; + fsp->xfer_len = offset; + fsp->xfer_contig_end = offset; + fsp->state &= ~FC_SRB_RCV_STATUS; + fc_fcp_pkt_hold(fsp); /* hold for outstanding SRR */ + return; +retry: + fc_fcp_retry_cmd(fsp); +} + +/* + * Handle response from SRR. + */ +static void fc_fcp_srr_resp(struct fc_seq *seq, struct fc_frame *fp, void *arg) +{ + struct fc_fcp_pkt *fsp = arg; + struct fc_frame_header *fh; + + if (IS_ERR(fp)) { + fc_fcp_srr_error(fsp, fp); + return; + } + + if (fc_fcp_lock_pkt(fsp)) + goto out; + + fh = fc_frame_header_get(fp); + /* + * BUG? fc_fcp_srr_error calls exch_done which would release + * the ep. But if fc_fcp_srr_error had got -FC_EX_TIMEOUT, + * then fc_exch_timeout would be sending an abort. The exch_done + * call by fc_fcp_srr_error would prevent fc_exch.c from seeing + * an abort response though. 
+ */ + if (fh->fh_type == FC_TYPE_BLS) { + fc_fcp_unlock_pkt(fsp); + return; + } + + fsp->recov_seq = NULL; + switch (fc_frame_payload_op(fp)) { + case ELS_LS_ACC: + fsp->recov_retry = 0; + fc_fcp_timer_set(fsp, FC_SCSI_REC_TOV); + break; + case ELS_LS_RJT: + default: + fc_timeout_error(fsp); + break; + } + fc_fcp_unlock_pkt(fsp); + fsp->lp->tt.exch_done(seq); +out: + fc_frame_free(fp); + fc_fcp_pkt_release(fsp); /* drop hold for outstanding SRR */ +} + +static void fc_fcp_srr_error(struct fc_fcp_pkt *fsp, struct fc_frame *fp) +{ + if (fc_fcp_lock_pkt(fsp)) + goto out; + fsp->lp->tt.exch_done(fsp->recov_seq); + fsp->recov_seq = NULL; + switch (PTR_ERR(fp)) { + case -FC_EX_TIMEOUT: + if (fsp->recov_retry++ < FC_MAX_RECOV_RETRY) + fc_fcp_rec(fsp); + else + fc_timeout_error(fsp); + break; + case -FC_EX_CLOSED: /* e.g., link failure */ + /* fall through */ + default: + fc_fcp_retry_cmd(fsp); + break; + } + fc_fcp_unlock_pkt(fsp); +out: + fc_fcp_pkt_release(fsp); /* drop hold for outstanding SRR */ +} + +static inline int fc_fcp_lport_queue_ready(struct fc_lport *lp) +{ + /* lock ? */ + return (lp->state == LPORT_ST_READY) && (lp->link_status & FC_LINK_UP); +} + +/** + * fc_queuecommand - The queuecommand function of the scsi template + * @sc_cmd: struct scsi_cmnd to be executed + * @done: Callback function to be called when cmd is completed + * + * this is the i/o strategy routine, called by the scsi layer + * this routine is called with the host_lock held. 
+ */ +int fc_queuecommand(struct scsi_cmnd *sc_cmd, void (*done)(struct scsi_cmnd *)) +{ + struct fc_lport *lp; + struct fc_rport *rport = starget_to_rport(scsi_target(sc_cmd->device)); + struct fc_fcp_pkt *fsp; + struct fc_rport_libfc_priv *rp; + int rval; + int rc = 0; + struct fcoe_dev_stats *stats; + + lp = shost_priv(sc_cmd->device->host); + + rval = fc_remote_port_chkready(rport); + if (rval) { + sc_cmd->result = rval; + done(sc_cmd); + goto out; + } + + if (!*(struct fc_remote_port **)rport->dd_data) { + /* + * rport is transitioning from blocked/deleted to + * online + */ + sc_cmd->result = DID_IMM_RETRY << 16; + done(sc_cmd); + goto out; + } + + rp = rport->dd_data; + + if (!fc_fcp_lport_queue_ready(lp)) { + rc = SCSI_MLQUEUE_HOST_BUSY; + goto out; + } + + fsp = fc_fcp_pkt_alloc(lp, GFP_ATOMIC); + if (fsp == NULL) { + rc = SCSI_MLQUEUE_HOST_BUSY; + goto out; + } + + /* + * build the libfc request pkt + */ + fsp->cmd = sc_cmd; /* save the cmd */ + fsp->lp = lp; /* save the softc ptr */ + fsp->rport = rport; /* set the remote port ptr */ + sc_cmd->scsi_done = done; + + /* + * set up the transfer length + */ + fsp->data_len = scsi_bufflen(sc_cmd); + fsp->xfer_len = 0; + + /* + * setup the data direction + */ + stats = lp->dev_stats[smp_processor_id()]; + if (sc_cmd->sc_data_direction == DMA_FROM_DEVICE) { + fsp->req_flags = FC_SRB_READ; + stats->InputRequests++; + stats->InputMegabytes = fsp->data_len; + } else if (sc_cmd->sc_data_direction == DMA_TO_DEVICE) { + fsp->req_flags = FC_SRB_WRITE; + stats->OutputRequests++; + stats->OutputMegabytes = fsp->data_len; + } else { + fsp->req_flags = 0; + stats->ControlRequests++; + } + + fsp->tgt_flags = rp->flags; + + init_timer(&fsp->timer); + fsp->timer.data = (unsigned long)fsp; + + /* + * send it to the lower layer + * if we get -1 return then put the request in the pending + * queue. 
+ */ + rval = fc_fcp_pkt_send(lp, fsp); + if (rval != 0) { + fsp->state = FC_SRB_FREE; + fc_fcp_pkt_release(fsp); + rc = SCSI_MLQUEUE_HOST_BUSY; + } +out: + return rc; +} +EXPORT_SYMBOL(fc_queuecommand); + +/** + * fc_io_compl - Handle responses for completed commands + * @fsp: scsi packet + * + * Translates an error to a Linux SCSI error. + * + * The fcp packet lock must be held when calling. + */ +static void fc_io_compl(struct fc_fcp_pkt *fsp) +{ + struct fc_fcp_internal *si; + struct scsi_cmnd *sc_cmd; + struct fc_lport *lp; + unsigned long flags; + + fsp->state |= FC_SRB_COMPL; + if (!(fsp->state & FC_SRB_FCP_PROCESSING_TMO)) { + spin_unlock_bh(&fsp->scsi_pkt_lock); + del_timer_sync(&fsp->timer); + spin_lock_bh(&fsp->scsi_pkt_lock); + } + + lp = fsp->lp; + si = fc_get_scsi_internal(lp); + spin_lock_irqsave(lp->host->host_lock, flags); + if (!fsp->cmd) { + spin_unlock_irqrestore(lp->host->host_lock, flags); + return; + } + + /* + * if a command timed out while we had to try and throttle IO + * and it is now getting cleaned up, then we are about to + * try again so clear the throttled flag in case we get more + * timeouts. 
+ */ + if (si->throttled && fsp->state & FC_SRB_NOMEM) + si->throttled = 0; + + sc_cmd = fsp->cmd; + fsp->cmd = NULL; + + if (!sc_cmd->SCp.ptr) { + spin_unlock_irqrestore(lp->host->host_lock, flags); + return; + } + + CMD_SCSI_STATUS(sc_cmd) = fsp->cdb_status; + switch (fsp->status_code) { + case FC_COMPLETE: + if (fsp->cdb_status == 0) { + /* + * good I/O status + */ + sc_cmd->result = DID_OK << 16; + if (fsp->scsi_resid) + CMD_RESID_LEN(sc_cmd) = fsp->scsi_resid; + } else if (fsp->cdb_status == QUEUE_FULL) { + struct scsi_device *tmp_sdev; + struct scsi_device *sdev = sc_cmd->device; + + shost_for_each_device(tmp_sdev, sdev->host) { + if (tmp_sdev->id != sdev->id) + continue; + + if (tmp_sdev->queue_depth > 1) { + scsi_track_queue_full(tmp_sdev, + tmp_sdev-> + queue_depth - 1); + } + } + sc_cmd->result = (DID_OK << 16) | fsp->cdb_status; + } else { + /* + * transport level I/O was ok but scsi + * has non zero status + */ + sc_cmd->result = (DID_OK << 16) | fsp->cdb_status; + } + break; + case FC_ERROR: + sc_cmd->result = DID_ERROR << 16; + break; + case FC_DATA_UNDRUN: + if (fsp->cdb_status == 0) { + /* + * scsi status is good but transport level + * underrun. for read it should be an error?? 
+ */ + sc_cmd->result = (DID_OK << 16) | fsp->cdb_status; + } else { + /* + * scsi got underrun, this is an error + */ + CMD_RESID_LEN(sc_cmd) = fsp->scsi_resid; + sc_cmd->result = (DID_ERROR << 16) | fsp->cdb_status; + } + break; + case FC_DATA_OVRRUN: + /* + * overrun is an error + */ + sc_cmd->result = (DID_ERROR << 16) | fsp->cdb_status; + break; + case FC_CMD_ABORTED: + sc_cmd->result = (DID_ABORT << 16) | fsp->io_status; + break; + case FC_CMD_TIME_OUT: + sc_cmd->result = (DID_BUS_BUSY << 16) | fsp->io_status; + break; + case FC_CMD_RESET: + sc_cmd->result = (DID_RESET << 16); + break; + case FC_HRD_ERROR: + sc_cmd->result = (DID_NO_CONNECT << 16); + break; + default: + sc_cmd->result = (DID_ERROR << 16); + break; + } + + list_del(&fsp->list); + sc_cmd->SCp.ptr = NULL; + sc_cmd->scsi_done(sc_cmd); + spin_unlock_irqrestore(lp->host->host_lock, flags); + + /* release ref from initial allocation in queue command */ + fc_fcp_pkt_release(fsp); +} + +/** + * fc_fcp_complete - complete processing of a fcp packet + * @fsp: fcp packet + * + * This function may sleep if a fsp timer is pending. + * The host lock must not be held by caller. + */ +void fc_fcp_complete(struct fc_fcp_pkt *fsp) +{ + if (fc_fcp_lock_pkt(fsp)) + return; + + fc_fcp_complete_locked(fsp); + fc_fcp_unlock_pkt(fsp); +} +EXPORT_SYMBOL(fc_fcp_complete); + +/** + * fc_eh_abort - Abort a command...from scsi host template + * @sc_cmd: scsi command to abort + * + * send ABTS to the target device and wait for the response + * sc_cmd is the pointer to the command to be aborted. 
+ */ +int fc_eh_abort(struct scsi_cmnd *sc_cmd) +{ + struct fc_fcp_pkt *fsp; + struct fc_lport *lp; + int rc = FAILED; + unsigned long flags; + + lp = shost_priv(sc_cmd->device->host); + if (lp->state != LPORT_ST_READY) + return rc; + else if (!(lp->link_status & FC_LINK_UP)) + return rc; + + spin_lock_irqsave(lp->host->host_lock, flags); + fsp = CMD_SP(sc_cmd); + if (!fsp) { + /* command completed while scsi eh was setting up */ + spin_unlock_irqrestore(lp->host->host_lock, flags); + return SUCCESS; + } + /* grab a ref so the fsp and sc_cmd cannot be released from under us */ + fc_fcp_pkt_hold(fsp); + spin_unlock_irqrestore(lp->host->host_lock, flags); + + if (fc_fcp_lock_pkt(fsp)) { + /* completed while we were waiting for timer to be deleted */ + rc = SUCCESS; + goto release_pkt; + } + + rc = fc_fcp_pkt_abort(lp, fsp); + fc_fcp_unlock_pkt(fsp); + +release_pkt: + fc_fcp_pkt_release(fsp); + return rc; +} +EXPORT_SYMBOL(fc_eh_abort); + +/** + * fc_eh_device_reset - Reset a single LUN + * @sc_cmd: scsi command + * + * Set from scsi host template to send tm cmd to the target and wait for the + * response. + */ +int fc_eh_device_reset(struct scsi_cmnd *sc_cmd) +{ + struct fc_lport *lp; + struct fc_fcp_pkt *fsp; + struct fc_rport *rport = starget_to_rport(scsi_target(sc_cmd->device)); + int rc = FAILED; + struct fc_rport_libfc_priv *rp; + int rval; + + rval = fc_remote_port_chkready(rport); + if (rval) + goto out; + + rp = rport->dd_data; + lp = shost_priv(sc_cmd->device->host); + + if (lp->state != LPORT_ST_READY) + return rc; + + fsp = fc_fcp_pkt_alloc(lp, GFP_NOIO); + if (fsp == NULL) { + FC_DBG("could not allocate scsi_pkt\n"); + sc_cmd->result = DID_NO_CONNECT << 16; + goto out; + } + + /* + * Build the libfc request pkt. Do not set the scsi cmnd, because + * the sc passed in is not set up for execution like when sent + * through the queuecommand callout. 
+ */ + fsp->lp = lp; /* save the softc ptr */ + fsp->rport = rport; /* set the remote port ptr */ + + /* + * flush outstanding commands + */ + rc = fc_lun_reset(lp, fsp, scmd_id(sc_cmd), sc_cmd->device->lun); + fsp->state = FC_SRB_FREE; + fc_fcp_pkt_release(fsp); + +out: + return rc; +} +EXPORT_SYMBOL(fc_eh_device_reset); + +/** + * fc_eh_host_reset - The reset function will reset the ports on the host. + * @sc_cmd: scsi command + */ +int fc_eh_host_reset(struct scsi_cmnd *sc_cmd) +{ + struct Scsi_Host *shost = sc_cmd->device->host; + struct fc_lport *lp = shost_priv(shost); + unsigned long wait_tmo; + + lp->tt.lport_reset(lp); + wait_tmo = jiffies + FC_HOST_RESET_TIMEOUT; + while (!fc_fcp_lport_queue_ready(lp) && time_before(jiffies, wait_tmo)) + msleep(1000); + + if (fc_fcp_lport_queue_ready(lp)) { + shost_printk(KERN_INFO, shost, "Host reset succeeded.\n"); + return SUCCESS; + } else { + shost_printk(KERN_INFO, shost, "Host reset failed. " + "lport not ready.\n"); + return FAILED; + } +} +EXPORT_SYMBOL(fc_eh_host_reset); + +/** + * fc_slave_alloc - configure queue depth + * @sdev: scsi device + * + * Configures queue depth based on host's cmd_per_lun. If not set + * then we use the libfc default. 
+ */ +int fc_slave_alloc(struct scsi_device *sdev) +{ + struct fc_rport *rport = starget_to_rport(scsi_target(sdev)); + int queue_depth; + + if (!rport || fc_remote_port_chkready(rport)) + return -ENXIO; + + if (sdev->tagged_supported) { + if (sdev->host->hostt->cmd_per_lun) + queue_depth = sdev->host->hostt->cmd_per_lun; + else + queue_depth = FC_FCP_DFLT_QUEUE_DEPTH; + scsi_activate_tcq(sdev, queue_depth); + } + return 0; +} +EXPORT_SYMBOL(fc_slave_alloc); + +int fc_change_queue_depth(struct scsi_device *sdev, int qdepth) +{ + scsi_adjust_queue_depth(sdev, scsi_get_tag_type(sdev), qdepth); + return sdev->queue_depth; +} +EXPORT_SYMBOL(fc_change_queue_depth); + +int fc_change_queue_type(struct scsi_device *sdev, int tag_type) +{ + if (sdev->tagged_supported) { + scsi_set_tag_type(sdev, tag_type); + if (tag_type) + scsi_activate_tcq(sdev, sdev->queue_depth); + else + scsi_deactivate_tcq(sdev, sdev->queue_depth); + } else + tag_type = 0; + + return tag_type; +} +EXPORT_SYMBOL(fc_change_queue_type); + +void fc_fcp_destroy(struct fc_lport *lp) +{ + struct fc_fcp_internal *si = fc_get_scsi_internal(lp); + + if (!list_empty(&si->scsi_pkt_queue)) + printk(KERN_ERR "Leaked scsi packets.\n"); + + mempool_destroy(si->scsi_pkt_pool); + kfree(si); + lp->scsi_priv = NULL; +} +EXPORT_SYMBOL(fc_fcp_destroy); + +int fc_fcp_init(struct fc_lport *lp) +{ + int rc; + struct fc_fcp_internal *si; + + if (!lp->tt.fcp_cmd_send) + lp->tt.fcp_cmd_send = fc_fcp_cmd_send; + + if (!lp->tt.fcp_cleanup) + lp->tt.fcp_cleanup = fc_fcp_cleanup; + + if (!lp->tt.fcp_abort_io) + lp->tt.fcp_abort_io = fc_fcp_abort_io; + + si = kzalloc(sizeof(struct fc_fcp_internal), GFP_KERNEL); + if (!si) + return -ENOMEM; + lp->scsi_priv = si; + INIT_LIST_HEAD(&si->scsi_pkt_queue); + + si->scsi_pkt_pool = mempool_create_slab_pool(2, scsi_pkt_cachep); + if (!si->scsi_pkt_pool) { + rc = -ENOMEM; + goto free_internal; + } + return 0; + +free_internal: + kfree(si); + return rc; +} +EXPORT_SYMBOL(fc_fcp_init); + +static 
int __init libfc_init(void) +{ + int rc; + + scsi_pkt_cachep = kmem_cache_create("libfc_fcp_pkt", + sizeof(struct fc_fcp_pkt), + 0, SLAB_HWCACHE_ALIGN, NULL); + if (scsi_pkt_cachep == NULL) { + FC_DBG("Unable to allocate SRB cache...module load failed!"); + return -ENOMEM; + } + + rc = fc_setup_exch_mgr(); + if (rc) + goto destroy_pkt_cache; + + rc = fc_setup_rport(); + if (rc) + goto destroy_em; + + return rc; +destroy_em: + fc_destroy_exch_mgr(); +destroy_pkt_cache: + kmem_cache_destroy(scsi_pkt_cachep); + return rc; +} + +static void __exit libfc_exit(void) +{ + kmem_cache_destroy(scsi_pkt_cachep); + fc_destroy_exch_mgr(); + fc_destroy_rport(); +} + +module_init(libfc_init); +module_exit(libfc_exit); diff --git a/drivers/scsi/libfc/fc_frame.c b/drivers/scsi/libfc/fc_frame.c new file mode 100644 index 00000000000..63fe00cfe66 --- /dev/null +++ b/drivers/scsi/libfc/fc_frame.c @@ -0,0 +1,89 @@ +/* + * Copyright(c) 2007 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +/* + * Frame allocation. + */ +#include +#include +#include +#include + +#include + +/* + * Check the CRC in a frame. 
+ */ +u32 fc_frame_crc_check(struct fc_frame *fp) +{ + u32 crc; + u32 error; + const u8 *bp; + unsigned int len; + + WARN_ON(!fc_frame_is_linear(fp)); + fr_flags(fp) &= ~FCPHF_CRC_UNCHECKED; + len = (fr_len(fp) + 3) & ~3; /* round up length to include fill */ + bp = (const u8 *) fr_hdr(fp); + crc = ~crc32(~0, bp, len); + error = crc ^ fr_crc(fp); + return error; +} +EXPORT_SYMBOL(fc_frame_crc_check); + +/* + * Allocate a frame intended to be sent via fcoe_xmit. + * Get an sk_buff for the frame and set the length. + */ +struct fc_frame *__fc_frame_alloc(size_t len) +{ + struct fc_frame *fp; + struct sk_buff *skb; + + WARN_ON((len % sizeof(u32)) != 0); + len += sizeof(struct fc_frame_header); + skb = dev_alloc_skb(len + FC_FRAME_HEADROOM + FC_FRAME_TAILROOM); + if (!skb) + return NULL; + fp = (struct fc_frame *) skb; + fc_frame_init(fp); + skb_reserve(skb, FC_FRAME_HEADROOM); + skb_put(skb, len); + return fp; +} +EXPORT_SYMBOL(__fc_frame_alloc); + + +struct fc_frame *fc_frame_alloc_fill(struct fc_lport *lp, size_t payload_len) +{ + struct fc_frame *fp; + size_t fill; + + fill = payload_len % 4; + if (fill != 0) + fill = 4 - fill; + fp = __fc_frame_alloc(payload_len + fill); + if (fp) { + memset((char *) fr_hdr(fp) + payload_len, 0, fill); + /* trim is OK, we just allocated it so there are no fragments */ + skb_trim(fp_skb(fp), + payload_len + sizeof(struct fc_frame_header)); + } + return fp; +} diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c new file mode 100644 index 00000000000..0b9bdb1fb80 --- /dev/null +++ b/drivers/scsi/libfc/fc_lport.c @@ -0,0 +1,1604 @@ +/* + * Copyright(c) 2007 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. 
+ * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +/* + * PORT LOCKING NOTES + * + * These comments only apply to the 'port code' which consists of the lport, + * disc and rport blocks. + * + * MOTIVATION + * + * The lport, disc and rport blocks all have mutexes that are used to protect + * those objects. The main motivation for these locks is to prevent + * an lport reset just before we send a frame. In that scenario the + * lport's FID would get set to zero and then we'd send a frame with an + * invalid SID. We also need to ensure that states don't change unexpectedly + * while processing another state. + * + * HIERARCHY + * + * The following hierarchy defines the locking rules. A greater lock + * may be held before acquiring a lesser lock, but a lesser lock should never + * be held while attempting to acquire a greater lock. Here is the hierarchy- + * + * lport > disc, lport > rport, disc > rport + * + * CALLBACKS + * + * The callbacks cause complications with this scheme. There is a callback + * from the rport (to either lport or disc) and a callback from disc + * (to the lport). + * + * As rports exit the rport state machine a callback is made to the owner of + * the rport to notify success or failure. Since the callback is likely to + * cause the lport or disc to grab its lock we cannot hold the rport lock + * while making the callback. To ensure that the rport is not freed while + * processing the callback the rport callbacks are serialized through a + * single-threaded workqueue.
An rport would never be freed while in a + * callback handler because no other rport work in this queue can be executed + * at the same time. + * + * When discovery succeeds or fails a callback is made to the lport as + * notification. Currently, successful discovery causes the lport to take no + * action. A failure will cause the lport to reset. There is likely a circular + * locking problem with this implementation. + */ + +/* + * LPORT LOCKING + * + * The critical sections protected by the lport's mutex are quite broad and + * may be improved upon in the future. The lport code and its locking do not + * influence the I/O path, so excessive locking doesn't penalize I/O + * performance. + * + * The strategy is to lock whenever processing a request or response. Note + * that every _enter_* function corresponds to a state change. They generally + * change the lport's state and then send a request out on the wire. We lock + * before calling any of these functions to protect that state change. This + * means that the entry points into the lport block manage the locks while + * the state machine can transition between states (i.e. _enter_* functions) + * while always staying protected. + * + * When handling responses we also hold the lport mutex broadly. When the + * lport receives the response frame it locks the mutex and then calls the + * appropriate handler for the particular response. Generally a response will + * trigger a state change and so the lock must already be held. + * + * Retries also have to consider the locking. The retries occur from a work + * context and the work function will lock the lport and then retry the state + * (i.e. _enter_* function). + */ + +#include +#include + +#include + +#include +#include + +/* Fabric IDs to use for point-to-point mode, chosen on whims.
*/ +#define FC_LOCAL_PTP_FID_LO 0x010101 +#define FC_LOCAL_PTP_FID_HI 0x010102 + +#define DNS_DELAY 3 /* Discovery delay after RSCN (in seconds)*/ + +static int fc_lport_debug; + +#define FC_DEBUG_LPORT(fmt...) \ + do { \ + if (fc_lport_debug) \ + FC_DBG(fmt); \ + } while (0) + +static void fc_lport_error(struct fc_lport *, struct fc_frame *); + +static void fc_lport_enter_reset(struct fc_lport *); +static void fc_lport_enter_flogi(struct fc_lport *); +static void fc_lport_enter_dns(struct fc_lport *); +static void fc_lport_enter_rpn_id(struct fc_lport *); +static void fc_lport_enter_rft_id(struct fc_lport *); +static void fc_lport_enter_scr(struct fc_lport *); +static void fc_lport_enter_ready(struct fc_lport *); +static void fc_lport_enter_logo(struct fc_lport *); + +static const char *fc_lport_state_names[] = { + [LPORT_ST_NONE] = "none", + [LPORT_ST_FLOGI] = "FLOGI", + [LPORT_ST_DNS] = "dNS", + [LPORT_ST_RPN_ID] = "RPN_ID", + [LPORT_ST_RFT_ID] = "RFT_ID", + [LPORT_ST_SCR] = "SCR", + [LPORT_ST_READY] = "Ready", + [LPORT_ST_LOGO] = "LOGO", + [LPORT_ST_RESET] = "reset", +}; + +static int fc_frame_drop(struct fc_lport *lport, struct fc_frame *fp) +{ + fc_frame_free(fp); + return 0; +} + +/** + * fc_lport_rport_callback - Event handler for rport events + * @lport: The lport which is receiving the event + * @rport: The rport which the event has occurred on + * @event: The event that occurred + * + * Locking Note: The rport lock should not be held when calling + * this function.
+ */ +static void fc_lport_rport_callback(struct fc_lport *lport, + struct fc_rport *rport, + enum fc_rport_event event) +{ + FC_DEBUG_LPORT("Received a %d event for port (%6x)\n", event, + rport->port_id); + + switch (event) { + case RPORT_EV_CREATED: + if (rport->port_id == FC_FID_DIR_SERV) { + mutex_lock(&lport->lp_mutex); + if (lport->state == LPORT_ST_DNS) { + lport->dns_rp = rport; + fc_lport_enter_rpn_id(lport); + } else { + FC_DEBUG_LPORT("Received a CREATED event on " + "port (%6x) for the directory " + "server, but the lport is not " + "in the DNS state, it's in the " + "%d state\n", rport->port_id, + lport->state); + lport->tt.rport_logoff(rport); + } + mutex_unlock(&lport->lp_mutex); + } else + FC_DEBUG_LPORT("Received an event for port (%6x) " + "which is not the directory server\n", + rport->port_id); + break; + case RPORT_EV_LOGO: + case RPORT_EV_FAILED: + case RPORT_EV_STOP: + if (rport->port_id == FC_FID_DIR_SERV) { + mutex_lock(&lport->lp_mutex); + lport->dns_rp = NULL; + mutex_unlock(&lport->lp_mutex); + + } else + FC_DEBUG_LPORT("Received an event for port (%6x) " + "which is not the directory server\n", + rport->port_id); + break; + case RPORT_EV_NONE: + break; + } +} + +/** + * fc_lport_state - Return a string which represents the lport's state + * @lport: The lport whose state is to be converted to a string + */ +static const char *fc_lport_state(struct fc_lport *lport) +{ + const char *cp; + + cp = fc_lport_state_names[lport->state]; + if (!cp) + cp = "unknown"; + return cp; +} + +/** + * fc_lport_ptp_setup - Create an rport for point-to-point mode + * @lport: The lport to attach the ptp rport to + * @remote_fid: The FID of the ptp rport + * @remote_wwpn: The WWPN of the ptp rport + * @remote_wwnn: The WWNN of the ptp rport + */ +static void fc_lport_ptp_setup(struct fc_lport *lport, + u32 remote_fid, u64 remote_wwpn, + u64 remote_wwnn) +{ + struct fc_disc_port dp; + + dp.lp = lport; + dp.ids.port_id = remote_fid; + dp.ids.port_name = remote_wwpn; +
dp.ids.node_name = remote_wwnn; + dp.ids.roles = FC_RPORT_ROLE_UNKNOWN; + + if (lport->ptp_rp) { + lport->tt.rport_logoff(lport->ptp_rp); + lport->ptp_rp = NULL; + } + + lport->ptp_rp = fc_rport_rogue_create(&dp); + + lport->tt.rport_login(lport->ptp_rp); + + fc_lport_enter_ready(lport); +} + +void fc_get_host_port_type(struct Scsi_Host *shost) +{ + /* TODO - currently just NPORT */ + fc_host_port_type(shost) = FC_PORTTYPE_NPORT; +} +EXPORT_SYMBOL(fc_get_host_port_type); + +void fc_get_host_port_state(struct Scsi_Host *shost) +{ + struct fc_lport *lp = shost_priv(shost); + + if ((lp->link_status & FC_LINK_UP) == FC_LINK_UP) + fc_host_port_state(shost) = FC_PORTSTATE_ONLINE; + else + fc_host_port_state(shost) = FC_PORTSTATE_OFFLINE; +} +EXPORT_SYMBOL(fc_get_host_port_state); + +void fc_get_host_speed(struct Scsi_Host *shost) +{ + struct fc_lport *lport = shost_priv(shost); + + fc_host_speed(shost) = lport->link_speed; +} +EXPORT_SYMBOL(fc_get_host_speed); + +struct fc_host_statistics *fc_get_host_stats(struct Scsi_Host *shost) +{ + int i; + struct fc_host_statistics *fcoe_stats; + struct fc_lport *lp = shost_priv(shost); + struct timespec v0, v1; + + fcoe_stats = &lp->host_stats; + memset(fcoe_stats, 0, sizeof(struct fc_host_statistics)); + + jiffies_to_timespec(jiffies, &v0); + jiffies_to_timespec(lp->boot_time, &v1); + fcoe_stats->seconds_since_last_reset = (v0.tv_sec - v1.tv_sec); + + for_each_online_cpu(i) { + struct fcoe_dev_stats *stats = lp->dev_stats[i]; + if (stats == NULL) + continue; + fcoe_stats->tx_frames += stats->TxFrames; + fcoe_stats->tx_words += stats->TxWords; + fcoe_stats->rx_frames += stats->RxFrames; + fcoe_stats->rx_words += stats->RxWords; + fcoe_stats->error_frames += stats->ErrorFrames; + fcoe_stats->invalid_crc_count += stats->InvalidCRCCount; + fcoe_stats->fcp_input_requests += stats->InputRequests; + fcoe_stats->fcp_output_requests += stats->OutputRequests; + fcoe_stats->fcp_control_requests += stats->ControlRequests; + 
fcoe_stats->fcp_input_megabytes += stats->InputMegabytes; + fcoe_stats->fcp_output_megabytes += stats->OutputMegabytes; + fcoe_stats->link_failure_count += stats->LinkFailureCount; + } + fcoe_stats->lip_count = -1; + fcoe_stats->nos_count = -1; + fcoe_stats->loss_of_sync_count = -1; + fcoe_stats->loss_of_signal_count = -1; + fcoe_stats->prim_seq_protocol_err_count = -1; + fcoe_stats->dumped_frames = -1; + return fcoe_stats; +} +EXPORT_SYMBOL(fc_get_host_stats); + +/* + * Fill in FLOGI command for request. + */ +static void +fc_lport_flogi_fill(struct fc_lport *lport, struct fc_els_flogi *flogi, + unsigned int op) +{ + struct fc_els_csp *sp; + struct fc_els_cssp *cp; + + memset(flogi, 0, sizeof(*flogi)); + flogi->fl_cmd = (u8) op; + put_unaligned_be64(lport->wwpn, &flogi->fl_wwpn); + put_unaligned_be64(lport->wwnn, &flogi->fl_wwnn); + sp = &flogi->fl_csp; + sp->sp_hi_ver = 0x20; + sp->sp_lo_ver = 0x20; + sp->sp_bb_cred = htons(10); /* this gets set by gateway */ + sp->sp_bb_data = htons((u16) lport->mfs); + cp = &flogi->fl_cssp[3 - 1]; /* class 3 parameters */ + cp->cp_class = htons(FC_CPC_VALID | FC_CPC_SEQ); + if (op != ELS_FLOGI) { + sp->sp_features = htons(FC_SP_FT_CIRO); + sp->sp_tot_seq = htons(255); /* seq. we accept */ + sp->sp_rel_off = htons(0x1f); + sp->sp_e_d_tov = htonl(lport->e_d_tov); + + cp->cp_rdfs = htons((u16) lport->mfs); + cp->cp_con_seq = htons(255); + cp->cp_open_seq = 1; + } +} + +/* + * Add a supported FC-4 type. + */ +static void fc_lport_add_fc4_type(struct fc_lport *lport, enum fc_fh_type type) +{ + __be32 *mp; + + mp = &lport->fcts.ff_type_map[type / FC_NS_BPW]; + *mp = htonl(ntohl(*mp) | 1UL << (type % FC_NS_BPW)); +} + +/** + * fc_lport_recv_rlir_req - Handle received Registered Link Incident Report. + * @lport: Fibre Channel local port receiving the RLIR + * @sp: current sequence in the RLIR exchange + * @fp: RLIR request frame + * + * Locking Note: The lport lock is expected to be held before calling + * this function.
+ */ +static void fc_lport_recv_rlir_req(struct fc_seq *sp, struct fc_frame *fp, + struct fc_lport *lport) +{ + FC_DEBUG_LPORT("Received RLIR request while in state %s\n", + fc_lport_state(lport)); + + lport->tt.seq_els_rsp_send(sp, ELS_LS_ACC, NULL); + fc_frame_free(fp); +} + +/** + * fc_lport_recv_echo_req - Handle received ECHO request + * @lport: Fibre Channel local port receiving the ECHO + * @sp: current sequence in the ECHO exchange + * @fp: ECHO request frame + * + * Locking Note: The lport lock is expected to be held before calling + * this function. + */ +static void fc_lport_recv_echo_req(struct fc_seq *sp, struct fc_frame *in_fp, + struct fc_lport *lport) +{ + struct fc_frame *fp; + struct fc_exch *ep = fc_seq_exch(sp); + unsigned int len; + void *pp; + void *dp; + u32 f_ctl; + + FC_DEBUG_LPORT("Received ECHO request while in state %s\n", + fc_lport_state(lport)); + + len = fr_len(in_fp) - sizeof(struct fc_frame_header); + pp = fc_frame_payload_get(in_fp, len); + + if (len < sizeof(__be32)) + len = sizeof(__be32); + + fp = fc_frame_alloc(lport, len); + if (fp) { + dp = fc_frame_payload_get(fp, len); + memcpy(dp, pp, len); + *((u32 *)dp) = htonl(ELS_LS_ACC << 24); + sp = lport->tt.seq_start_next(sp); + f_ctl = FC_FC_EX_CTX | FC_FC_LAST_SEQ | FC_FC_END_SEQ; + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REP, ep->did, ep->sid, + FC_TYPE_ELS, f_ctl, 0); + lport->tt.seq_send(lport, sp, fp); + } + fc_frame_free(in_fp); +} + +/** + * fc_lport_recv_rnid_req - Handle received Request Node ID data request + * @lport: Fibre Channel local port receiving the RNID + * @sp: current sequence in the RNID exchange + * @fp: RNID request frame + * + * Locking Note: The lport lock is expected to be held before calling + * this function.
+ */ +static void fc_lport_recv_rnid_req(struct fc_seq *sp, struct fc_frame *in_fp, + struct fc_lport *lport) +{ + struct fc_frame *fp; + struct fc_exch *ep = fc_seq_exch(sp); + struct fc_els_rnid *req; + struct { + struct fc_els_rnid_resp rnid; + struct fc_els_rnid_cid cid; + struct fc_els_rnid_gen gen; + } *rp; + struct fc_seq_els_data rjt_data; + u8 fmt; + size_t len; + u32 f_ctl; + + FC_DEBUG_LPORT("Received RNID request while in state %s\n", + fc_lport_state(lport)); + + req = fc_frame_payload_get(in_fp, sizeof(*req)); + if (!req) { + rjt_data.fp = NULL; + rjt_data.reason = ELS_RJT_LOGIC; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + } else { + fmt = req->rnid_fmt; + len = sizeof(*rp); + if (fmt != ELS_RNIDF_GEN || + ntohl(lport->rnid_gen.rnid_atype) == 0) { + fmt = ELS_RNIDF_NONE; /* nothing to provide */ + len -= sizeof(rp->gen); + } + fp = fc_frame_alloc(lport, len); + if (fp) { + rp = fc_frame_payload_get(fp, len); + memset(rp, 0, len); + rp->rnid.rnid_cmd = ELS_LS_ACC; + rp->rnid.rnid_fmt = fmt; + rp->rnid.rnid_cid_len = sizeof(rp->cid); + rp->cid.rnid_wwpn = htonll(lport->wwpn); + rp->cid.rnid_wwnn = htonll(lport->wwnn); + if (fmt == ELS_RNIDF_GEN) { + rp->rnid.rnid_sid_len = sizeof(rp->gen); + memcpy(&rp->gen, &lport->rnid_gen, + sizeof(rp->gen)); + } + sp = lport->tt.seq_start_next(sp); + f_ctl = FC_FC_EX_CTX | FC_FC_LAST_SEQ; + f_ctl |= FC_FC_END_SEQ | FC_FC_SEQ_INIT; + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REP, ep->did, ep->sid, + FC_TYPE_ELS, f_ctl, 0); + lport->tt.seq_send(lport, sp, fp); + } + } + fc_frame_free(in_fp); +} + +/** + * fc_lport_recv_adisc_req - Handle received Address Discovery Request + * @lport: Fibre Channel local port receiving the ADISC + * @sp: current sequence in the ADISC exchange + * @fp: ADISC request frame + * + * Locking Note: The lport lock is expected to be held before calling + * this function.
+ */ +static void fc_lport_recv_adisc_req(struct fc_seq *sp, struct fc_frame *in_fp, + struct fc_lport *lport) +{ + struct fc_frame *fp; + struct fc_exch *ep = fc_seq_exch(sp); + struct fc_els_adisc *req, *rp; + struct fc_seq_els_data rjt_data; + size_t len; + u32 f_ctl; + + FC_DEBUG_LPORT("Received ADISC request while in state %s\n", + fc_lport_state(lport)); + + req = fc_frame_payload_get(in_fp, sizeof(*req)); + if (!req) { + rjt_data.fp = NULL; + rjt_data.reason = ELS_RJT_LOGIC; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + } else { + len = sizeof(*rp); + fp = fc_frame_alloc(lport, len); + if (fp) { + rp = fc_frame_payload_get(fp, len); + memset(rp, 0, len); + rp->adisc_cmd = ELS_LS_ACC; + rp->adisc_wwpn = htonll(lport->wwpn); + rp->adisc_wwnn = htonll(lport->wwnn); + hton24(rp->adisc_port_id, + fc_host_port_id(lport->host)); + sp = lport->tt.seq_start_next(sp); + f_ctl = FC_FC_EX_CTX | FC_FC_LAST_SEQ; + f_ctl |= FC_FC_END_SEQ | FC_FC_SEQ_INIT; + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REP, ep->did, ep->sid, + FC_TYPE_ELS, f_ctl, 0); + lport->tt.seq_send(lport, sp, fp); + } + } + fc_frame_free(in_fp); +} + +/** + * fc_lport_recv_logo_req - Handle received fabric LOGO request + * @lport: Fibre Channel local port receiving the LOGO + * @sp: current sequence in the LOGO exchange + * @fp: LOGO request frame + * + * Locking Note: The lport lock is expected to be held before calling + * this function. + */ +static void fc_lport_recv_logo_req(struct fc_seq *sp, struct fc_frame *fp, + struct fc_lport *lport) +{ + lport->tt.seq_els_rsp_send(sp, ELS_LS_ACC, NULL); + fc_lport_enter_reset(lport); + fc_frame_free(fp); +} + +/** + * fc_fabric_login - Start the lport state machine + * @lport: The lport that should log into the fabric + * + * Locking Note: This function should not be called + * with the lport lock held.
+ */ +int fc_fabric_login(struct fc_lport *lport) +{ + int rc = -1; + + mutex_lock(&lport->lp_mutex); + if (lport->state == LPORT_ST_NONE) { + fc_lport_enter_reset(lport); + rc = 0; + } + mutex_unlock(&lport->lp_mutex); + + return rc; +} +EXPORT_SYMBOL(fc_fabric_login); + +/** + * fc_linkup - Handler for transport linkup events + * @lport: The lport whose link is up + */ +void fc_linkup(struct fc_lport *lport) +{ + FC_DEBUG_LPORT("Link is up for port (%6x)\n", + fc_host_port_id(lport->host)); + + mutex_lock(&lport->lp_mutex); + if ((lport->link_status & FC_LINK_UP) != FC_LINK_UP) { + lport->link_status |= FC_LINK_UP; + + if (lport->state == LPORT_ST_RESET) + fc_lport_enter_flogi(lport); + } + mutex_unlock(&lport->lp_mutex); +} +EXPORT_SYMBOL(fc_linkup); + +/** + * fc_linkdown - Handler for transport linkdown events + * @lport: The lport whose link is down + */ +void fc_linkdown(struct fc_lport *lport) +{ + mutex_lock(&lport->lp_mutex); + FC_DEBUG_LPORT("Link is down for port (%6x)\n", + fc_host_port_id(lport->host)); + + if ((lport->link_status & FC_LINK_UP) == FC_LINK_UP) { + lport->link_status &= ~(FC_LINK_UP); + fc_lport_enter_reset(lport); + lport->tt.fcp_cleanup(lport); + } + mutex_unlock(&lport->lp_mutex); +} +EXPORT_SYMBOL(fc_linkdown); + +/** + * fc_pause - Pause the flow of frames + * @lport: The lport to be paused + */ +void fc_pause(struct fc_lport *lport) +{ + mutex_lock(&lport->lp_mutex); + lport->link_status |= FC_PAUSE; + mutex_unlock(&lport->lp_mutex); +} +EXPORT_SYMBOL(fc_pause); + +/** + * fc_unpause - Unpause the flow of frames + * @lport: The lport to be unpaused + */ +void fc_unpause(struct fc_lport *lport) +{ + mutex_lock(&lport->lp_mutex); + lport->link_status &= ~(FC_PAUSE); + mutex_unlock(&lport->lp_mutex); +} +EXPORT_SYMBOL(fc_unpause); + +/** + * fc_fabric_logoff - Logout of the fabric + * @lport: fc_lport pointer to logoff the fabric + * + * Return value: + * 0 for success, -1 for failure + **/ +int fc_fabric_logoff(struct fc_lport 
*lport) +{ + lport->tt.disc_stop_final(lport); + mutex_lock(&lport->lp_mutex); + fc_lport_enter_logo(lport); + mutex_unlock(&lport->lp_mutex); + return 0; +} +EXPORT_SYMBOL(fc_fabric_logoff); + +/** + * fc_lport_destroy - unregister a fc_lport + * @lport: fc_lport pointer to unregister + * + * Return value: + * 0 (always) + * Note: + * exit routine for fc_lport instance + * clean-up all the allocated memory + * and free up other system resources. + * + **/ +int fc_lport_destroy(struct fc_lport *lport) +{ + lport->tt.frame_send = fc_frame_drop; + lport->tt.fcp_abort_io(lport); + lport->tt.exch_mgr_reset(lport->emp, 0, 0); + return 0; +} +EXPORT_SYMBOL(fc_lport_destroy); + +/** + * fc_set_mfs - sets up the mfs for the corresponding fc_lport + * @lport: fc_lport pointer whose mfs is to be set + * @mfs: the new mfs for fc_lport + * + * Set mfs for the given fc_lport to the new mfs. + * + * Return: 0 for success, -EINVAL if mfs is below FC_MIN_MAX_FRAME + * + **/ +int fc_set_mfs(struct fc_lport *lport, u32 mfs) +{ + unsigned int old_mfs; + int rc = -EINVAL; + + mutex_lock(&lport->lp_mutex); + + old_mfs = lport->mfs; + + if (mfs >= FC_MIN_MAX_FRAME) { + mfs &= ~3; + if (mfs > FC_MAX_FRAME) + mfs = FC_MAX_FRAME; + mfs -= sizeof(struct fc_frame_header); + lport->mfs = mfs; + rc = 0; + } + + if (!rc && mfs < old_mfs) + fc_lport_enter_reset(lport); + + mutex_unlock(&lport->lp_mutex); + + return rc; +} +EXPORT_SYMBOL(fc_set_mfs); + +/** + * fc_lport_disc_callback - Callback for discovery events + * @lport: FC local port + * @event: The discovery event + */ +void fc_lport_disc_callback(struct fc_lport *lport, enum fc_disc_event event) +{ + switch (event) { + case DISC_EV_SUCCESS: + FC_DEBUG_LPORT("Got a SUCCESS event for port (%6x)\n", + fc_host_port_id(lport->host)); + break; + case DISC_EV_FAILED: + FC_DEBUG_LPORT("Got a FAILED event for port (%6x)\n", + fc_host_port_id(lport->host)); + mutex_lock(&lport->lp_mutex); + fc_lport_enter_reset(lport); + mutex_unlock(&lport->lp_mutex); + break; + case DISC_EV_NONE: + WARN_ON(1); +
break; + } +} + +/** + * fc_lport_enter_ready - Enter the ready state and start discovery + * @lport: Fibre Channel local port that is ready + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. + */ +static void fc_lport_enter_ready(struct fc_lport *lport) +{ + FC_DEBUG_LPORT("Port (%6x) entered Ready from state %s\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_READY); + + lport->tt.disc_start(fc_lport_disc_callback, lport); +} + +/** + * fc_lport_recv_flogi_req - Receive a FLOGI request + * @sp_in: The sequence the FLOGI is on + * @rx_fp: The frame the FLOGI is in + * @lport: The lport that received the request + * + * A received FLOGI request indicates a point-to-point connection. + * Accept it with the common service parameters indicating our N port. + * Set up to do a PLOGI if we have the higher-number WWPN. + * + * Locking Note: The lport lock is expected to be held before calling + * this function. + */ +static void fc_lport_recv_flogi_req(struct fc_seq *sp_in, + struct fc_frame *rx_fp, + struct fc_lport *lport) +{ + struct fc_frame *fp; + struct fc_frame_header *fh; + struct fc_seq *sp; + struct fc_exch *ep; + struct fc_els_flogi *flp; + struct fc_els_flogi *new_flp; + u64 remote_wwpn; + u32 remote_fid; + u32 local_fid; + u32 f_ctl; + + FC_DEBUG_LPORT("Received FLOGI request while in state %s\n", + fc_lport_state(lport)); + + fh = fc_frame_header_get(rx_fp); + remote_fid = ntoh24(fh->fh_s_id); + flp = fc_frame_payload_get(rx_fp, sizeof(*flp)); + if (!flp) + goto out; + remote_wwpn = get_unaligned_be64(&flp->fl_wwpn); + if (remote_wwpn == lport->wwpn) { + FC_DBG("FLOGI from port with same WWPN %llx " + "possible configuration error\n", remote_wwpn); + goto out; + } + FC_DBG("FLOGI from port WWPN %llx\n", remote_wwpn); + + /* + * XXX what is the right thing to do for FIDs? + * The originator might expect our S_ID to be 0xfffffe.
+ * But if so, both of us could end up with the same FID. + */ + local_fid = FC_LOCAL_PTP_FID_LO; + if (remote_wwpn < lport->wwpn) { + local_fid = FC_LOCAL_PTP_FID_HI; + if (!remote_fid || remote_fid == local_fid) + remote_fid = FC_LOCAL_PTP_FID_LO; + } else if (!remote_fid) { + remote_fid = FC_LOCAL_PTP_FID_HI; + } + + fc_host_port_id(lport->host) = local_fid; + + fp = fc_frame_alloc(lport, sizeof(*flp)); + if (fp) { + sp = lport->tt.seq_start_next(fr_seq(rx_fp)); + new_flp = fc_frame_payload_get(fp, sizeof(*flp)); + fc_lport_flogi_fill(lport, new_flp, ELS_FLOGI); + new_flp->fl_cmd = (u8) ELS_LS_ACC; + + /* + * Send the response. If this fails, the originator should + * repeat the sequence. + */ + f_ctl = FC_FC_EX_CTX | FC_FC_LAST_SEQ | FC_FC_END_SEQ; + ep = fc_seq_exch(sp); + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REP, ep->did, ep->sid, + FC_TYPE_ELS, f_ctl, 0); + lport->tt.seq_send(lport, sp, fp); + + } else { + fc_lport_error(lport, fp); + } + fc_lport_ptp_setup(lport, remote_fid, remote_wwpn, + get_unaligned_be64(&flp->fl_wwnn)); + + lport->tt.disc_start(fc_lport_disc_callback, lport); + +out: + sp = fr_seq(rx_fp); + fc_frame_free(rx_fp); +} + +/** + * fc_lport_recv_req - The generic lport request handler + * @lport: The lport that received the request + * @sp: The sequence the request is on + * @fp: The frame the request is in + * + * This function will see if the lport handles the request or + * if an rport should handle the request. + * + * Locking Note: This function should not be called with the lport + * lock held because it will grab the lock. + */ +static void fc_lport_recv_req(struct fc_lport *lport, struct fc_seq *sp, + struct fc_frame *fp) +{ + struct fc_frame_header *fh = fc_frame_header_get(fp); + void (*recv) (struct fc_seq *, struct fc_frame *, struct fc_lport *); + struct fc_rport *rport; + u32 s_id; + u32 d_id; + struct fc_seq_els_data rjt_data; + + mutex_lock(&lport->lp_mutex); + + /* + * Handle special ELS cases like FLOGI, LOGO, and + * RSCN here.
These don't require a session. + * Even if we had a session, it might not be ready. + */ + if (fh->fh_type == FC_TYPE_ELS && fh->fh_r_ctl == FC_RCTL_ELS_REQ) { + /* + * Check opcode. + */ + recv = NULL; + switch (fc_frame_payload_op(fp)) { + case ELS_FLOGI: + recv = fc_lport_recv_flogi_req; + break; + case ELS_LOGO: + fh = fc_frame_header_get(fp); + if (ntoh24(fh->fh_s_id) == FC_FID_FLOGI) + recv = fc_lport_recv_logo_req; + break; + case ELS_RSCN: + recv = lport->tt.disc_recv_req; + break; + case ELS_ECHO: + recv = fc_lport_recv_echo_req; + break; + case ELS_RLIR: + recv = fc_lport_recv_rlir_req; + break; + case ELS_RNID: + recv = fc_lport_recv_rnid_req; + break; + case ELS_ADISC: + recv = fc_lport_recv_adisc_req; + break; + } + + if (recv) + recv(sp, fp, lport); + else { + /* + * Find session. + * If this is a new incoming PLOGI, we won't find it. + */ + s_id = ntoh24(fh->fh_s_id); + d_id = ntoh24(fh->fh_d_id); + + rport = lport->tt.rport_lookup(lport, s_id); + if (rport) + lport->tt.rport_recv_req(sp, fp, rport); + else { + rjt_data.fp = NULL; + rjt_data.reason = ELS_RJT_UNAB; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, + ELS_LS_RJT, + &rjt_data); + fc_frame_free(fp); + } + } + } else { + FC_DBG("dropping invalid frame (eof %x)\n", fr_eof(fp)); + fc_frame_free(fp); + } + mutex_unlock(&lport->lp_mutex); + + /* + * The common exch_done for all requests may not be good + * if any request requires a longer hold on the exchange. XXX + */ + lport->tt.exch_done(sp); +} + +/** + * fc_lport_reset - Reset an lport + * @lport: The lport which should be reset + * + * Locking Note: This function should not be called with the + * lport lock held.
+ */ +int fc_lport_reset(struct fc_lport *lport) +{ + mutex_lock(&lport->lp_mutex); + fc_lport_enter_reset(lport); + mutex_unlock(&lport->lp_mutex); + return 0; +} +EXPORT_SYMBOL(fc_lport_reset); + +/** + * fc_lport_enter_reset - Reset the local port + * @lport: Fibre Channel local port to be reset + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. + */ +static void fc_lport_enter_reset(struct fc_lport *lport) +{ + FC_DEBUG_LPORT("Port (%6x) entered RESET state from %s state\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_RESET); + + if (lport->dns_rp) + lport->tt.rport_logoff(lport->dns_rp); + + if (lport->ptp_rp) { + lport->tt.rport_logoff(lport->ptp_rp); + lport->ptp_rp = NULL; + } + + lport->tt.disc_stop(lport); + + lport->tt.exch_mgr_reset(lport->emp, 0, 0); + fc_host_fabric_name(lport->host) = 0; + fc_host_port_id(lport->host) = 0; + + if ((lport->link_status & FC_LINK_UP) == FC_LINK_UP) + fc_lport_enter_flogi(lport); +} + +/** + * fc_lport_error - Handler for any errors + * @lport: The fc_lport object + * @fp: The frame pointer + * + * If the error was caused by a resource allocation failure + * then wait for half a second and retry, otherwise retry + * after the e_d_tov time. + */ +static void fc_lport_error(struct fc_lport *lport, struct fc_frame *fp) +{ + unsigned long delay = 0; + FC_DEBUG_LPORT("Error %ld in state %s, retries %d\n", + PTR_ERR(fp), fc_lport_state(lport), + lport->retry_count); + + if (!fp || PTR_ERR(fp) == -FC_EX_TIMEOUT) { + /* + * Memory allocation failure, or the exchange timed out.
+ * Retry after delay + */ + if (lport->retry_count < lport->max_retry_count) { + lport->retry_count++; + if (!fp) + delay = msecs_to_jiffies(500); + else + delay = msecs_to_jiffies(lport->e_d_tov); + + schedule_delayed_work(&lport->retry_work, delay); + } else { + switch (lport->state) { + case LPORT_ST_NONE: + case LPORT_ST_READY: + case LPORT_ST_RESET: + case LPORT_ST_RPN_ID: + case LPORT_ST_RFT_ID: + case LPORT_ST_SCR: + case LPORT_ST_DNS: + case LPORT_ST_FLOGI: + case LPORT_ST_LOGO: + fc_lport_enter_reset(lport); + break; + } + } + } +} + +/** + * fc_lport_rft_id_resp - Handle response to Register Fibre + * Channel Types by ID (RFT_ID) request + * @sp: current sequence in RFT_ID exchange + * @fp: response frame + * @lp_arg: Fibre Channel host port instance + * + * Locking Note: This function will be called without the lport lock + * held, but it will lock, call an _enter_* function or fc_lport_error + * and then unlock the lport. + */ +static void fc_lport_rft_id_resp(struct fc_seq *sp, struct fc_frame *fp, + void *lp_arg) +{ + struct fc_lport *lport = lp_arg; + struct fc_frame_header *fh; + struct fc_ct_hdr *ct; + + if (fp == ERR_PTR(-FC_EX_CLOSED)) + return; + + mutex_lock(&lport->lp_mutex); + + FC_DEBUG_LPORT("Received a RFT_ID response\n"); + + if (lport->state != LPORT_ST_RFT_ID) { + FC_DBG("Received a RFT_ID response, but in state %s\n", + fc_lport_state(lport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_lport_error(lport, fp); + goto err; + } + + fh = fc_frame_header_get(fp); + ct = fc_frame_payload_get(fp, sizeof(*ct)); + + if (fh && ct && fh->fh_type == FC_TYPE_CT && + ct->ct_fs_type == FC_FST_DIR && + ct->ct_fs_subtype == FC_NS_SUBTYPE && + ntohs(ct->ct_cmd) == FC_FS_ACC) + fc_lport_enter_scr(lport); + else + fc_lport_error(lport, fp); +out: + fc_frame_free(fp); +err: + mutex_unlock(&lport->lp_mutex); +} + +/** + * fc_lport_rpn_id_resp - Handle response to Register Port + * Name by ID (RPN_ID) request + * @sp: current sequence in RPN_ID exchange + *
@fp: response frame + * @lp_arg: Fibre Channel host port instance + * + * Locking Note: This function will be called without the lport lock + * held, but it will lock, call an _enter_* function or fc_lport_error + * and then unlock the lport. + */ +static void fc_lport_rpn_id_resp(struct fc_seq *sp, struct fc_frame *fp, + void *lp_arg) +{ + struct fc_lport *lport = lp_arg; + struct fc_frame_header *fh; + struct fc_ct_hdr *ct; + + if (fp == ERR_PTR(-FC_EX_CLOSED)) + return; + + mutex_lock(&lport->lp_mutex); + + FC_DEBUG_LPORT("Received a RPN_ID response\n"); + + if (lport->state != LPORT_ST_RPN_ID) { + FC_DBG("Received a RPN_ID response, but in state %s\n", + fc_lport_state(lport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_lport_error(lport, fp); + goto err; + } + + fh = fc_frame_header_get(fp); + ct = fc_frame_payload_get(fp, sizeof(*ct)); + if (fh && ct && fh->fh_type == FC_TYPE_CT && + ct->ct_fs_type == FC_FST_DIR && + ct->ct_fs_subtype == FC_NS_SUBTYPE && + ntohs(ct->ct_cmd) == FC_FS_ACC) + fc_lport_enter_rft_id(lport); + else + fc_lport_error(lport, fp); + +out: + fc_frame_free(fp); +err: + mutex_unlock(&lport->lp_mutex); +} + +/** + * fc_lport_scr_resp - Handle response to State Change Register (SCR) request + * @sp: current sequence in SCR exchange + * @fp: response frame + * @lp_arg: Fibre Channel lport port instance that sent the registration request + * + * Locking Note: This function will be called without the lport lock + * held, but it will lock, call an _enter_* function or fc_lport_error + * and then unlock the lport. 
+ */ +static void fc_lport_scr_resp(struct fc_seq *sp, struct fc_frame *fp, + void *lp_arg) +{ + struct fc_lport *lport = lp_arg; + u8 op; + + if (fp == ERR_PTR(-FC_EX_CLOSED)) + return; + + mutex_lock(&lport->lp_mutex); + + FC_DEBUG_LPORT("Received a SCR response\n"); + + if (lport->state != LPORT_ST_SCR) { + FC_DBG("Received a SCR response, but in state %s\n", + fc_lport_state(lport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_lport_error(lport, fp); + goto err; + } + + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC) + fc_lport_enter_ready(lport); + else + fc_lport_error(lport, fp); + +out: + fc_frame_free(fp); +err: + mutex_unlock(&lport->lp_mutex); +} + +/** + * fc_lport_enter_scr - Send a State Change Register (SCR) request + * @lport: Fibre Channel local port to register for state changes + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. + */ +static void fc_lport_enter_scr(struct fc_lport *lport) +{ + struct fc_frame *fp; + + FC_DEBUG_LPORT("Port (%6x) entered SCR state from %s state\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_SCR); + + fp = fc_frame_alloc(lport, sizeof(struct fc_els_scr)); + if (!fp) { + fc_lport_error(lport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, NULL, fp, ELS_SCR, + fc_lport_scr_resp, lport, lport->e_d_tov)) + fc_lport_error(lport, fp); +} + +/** + * fc_lport_enter_rft_id - Register FC4-types with the name server + * @lport: Fibre Channel local port to register + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. 
+ */ +static void fc_lport_enter_rft_id(struct fc_lport *lport) +{ + struct fc_frame *fp; + struct fc_ns_fts *lps; + int i; + + FC_DEBUG_LPORT("Port (%6x) entered RFT_ID state from %s state\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_RFT_ID); + + lps = &lport->fcts; + i = sizeof(lps->ff_type_map) / sizeof(lps->ff_type_map[0]); + while (--i >= 0) + if (ntohl(lps->ff_type_map[i]) != 0) + break; + if (i < 0) { + /* nothing to register, move on to SCR */ + fc_lport_enter_scr(lport); + return; + } + + fp = fc_frame_alloc(lport, sizeof(struct fc_ct_hdr) + + sizeof(struct fc_ns_rft)); + if (!fp) { + fc_lport_error(lport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, NULL, fp, FC_NS_RFT_ID, + fc_lport_rft_id_resp, + lport, lport->e_d_tov)) + fc_lport_error(lport, fp); +} + +/** + * fc_lport_enter_rpn_id - Register port name with the name server + * @lport: Fibre Channel local port to register + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. + */ +static void fc_lport_enter_rpn_id(struct fc_lport *lport) +{ + struct fc_frame *fp; + + FC_DEBUG_LPORT("Port (%6x) entered RPN_ID state from %s state\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_RPN_ID); + + fp = fc_frame_alloc(lport, sizeof(struct fc_ct_hdr) + + sizeof(struct fc_ns_rn_id)); + if (!fp) { + fc_lport_error(lport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, NULL, fp, FC_NS_RPN_ID, + fc_lport_rpn_id_resp, + lport, lport->e_d_tov)) + fc_lport_error(lport, fp); +} + +static struct fc_rport_operations fc_lport_rport_ops = { + .event_callback = fc_lport_rport_callback, +}; + +/** + * fc_lport_enter_dns - Create an rport to the name server + * @lport: Fibre Channel local port requesting an rport for the name server + * + * Locking Note: The lport lock is expected to be held before calling + * this routine.
+ */ +static void fc_lport_enter_dns(struct fc_lport *lport) +{ + struct fc_rport *rport; + struct fc_rport_libfc_priv *rdata; + struct fc_disc_port dp; + + dp.ids.port_id = FC_FID_DIR_SERV; + dp.ids.port_name = -1; + dp.ids.node_name = -1; + dp.ids.roles = FC_RPORT_ROLE_UNKNOWN; + dp.lp = lport; + + FC_DEBUG_LPORT("Port (%6x) entered DNS state from %s state\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_DNS); + + rport = fc_rport_rogue_create(&dp); + if (!rport) + goto err; + + rdata = rport->dd_data; + rdata->ops = &fc_lport_rport_ops; + lport->tt.rport_login(rport); + return; + +err: + fc_lport_error(lport, NULL); +} + +/** + * fc_lport_timeout - Handler for the retry_work timer. + * @work: The work struct of the fc_lport + */ +static void fc_lport_timeout(struct work_struct *work) +{ + struct fc_lport *lport = + container_of(work, struct fc_lport, + retry_work.work); + + mutex_lock(&lport->lp_mutex); + + switch (lport->state) { + case LPORT_ST_NONE: + case LPORT_ST_READY: + case LPORT_ST_RESET: + WARN_ON(1); + break; + case LPORT_ST_FLOGI: + fc_lport_enter_flogi(lport); + break; + case LPORT_ST_DNS: + fc_lport_enter_dns(lport); + break; + case LPORT_ST_RPN_ID: + fc_lport_enter_rpn_id(lport); + break; + case LPORT_ST_RFT_ID: + fc_lport_enter_rft_id(lport); + break; + case LPORT_ST_SCR: + fc_lport_enter_scr(lport); + break; + case LPORT_ST_LOGO: + fc_lport_enter_logo(lport); + break; + } + + mutex_unlock(&lport->lp_mutex); +} + +/** + * fc_lport_logo_resp - Handle response to LOGO request + * @sp: current sequence in LOGO exchange + * @fp: response frame + * @lp_arg: Fibre Channel lport port instance that sent the LOGO request + * + * Locking Note: This function will be called without the lport lock + * held, but it will lock, call an _enter_* function or fc_lport_error + * and then unlock the lport. 
+ */ +static void fc_lport_logo_resp(struct fc_seq *sp, struct fc_frame *fp, + void *lp_arg) +{ + struct fc_lport *lport = lp_arg; + u8 op; + + if (fp == ERR_PTR(-FC_EX_CLOSED)) + return; + + mutex_lock(&lport->lp_mutex); + + FC_DEBUG_LPORT("Received a LOGO response\n"); + + if (lport->state != LPORT_ST_LOGO) { + FC_DBG("Received a LOGO response, but in state %s\n", + fc_lport_state(lport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_lport_error(lport, fp); + goto err; + } + + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC) + fc_lport_enter_reset(lport); + else + fc_lport_error(lport, fp); + +out: + fc_frame_free(fp); +err: + mutex_unlock(&lport->lp_mutex); +} + +/** + * fc_lport_enter_logo - Log out of the fabric + * @lport: Fibre Channel local port to be logged out + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. + */ +static void fc_lport_enter_logo(struct fc_lport *lport) +{ + struct fc_frame *fp; + struct fc_els_logo *logo; + + FC_DEBUG_LPORT("Port (%6x) entered LOGO state from %s state\n", + fc_host_port_id(lport->host), fc_lport_state(lport)); + + fc_lport_state_enter(lport, LPORT_ST_LOGO); + + /* DNS session should be closed so we can release it here */ + if (lport->dns_rp) + lport->tt.rport_logoff(lport->dns_rp); + + fp = fc_frame_alloc(lport, sizeof(*logo)); + if (!fp) { + fc_lport_error(lport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, NULL, fp, ELS_LOGO, fc_lport_logo_resp, + lport, lport->e_d_tov)) + fc_lport_error(lport, fp); +} + +/** + * fc_lport_flogi_resp - Handle response to FLOGI request + * @sp: current sequence in FLOGI exchange + * @fp: response frame + * @lp_arg: Fibre Channel lport port instance that sent the FLOGI request + * + * Locking Note: This function will be called without the lport lock + * held, but it will lock, call an _enter_* function or fc_lport_error + * and then unlock the lport.
+ */ +static void fc_lport_flogi_resp(struct fc_seq *sp, struct fc_frame *fp, + void *lp_arg) +{ + struct fc_lport *lport = lp_arg; + struct fc_frame_header *fh; + struct fc_els_flogi *flp; + u32 did; + u16 csp_flags; + unsigned int r_a_tov; + unsigned int e_d_tov; + u16 mfs; + + if (fp == ERR_PTR(-FC_EX_CLOSED)) + return; + + mutex_lock(&lport->lp_mutex); + + FC_DEBUG_LPORT("Received a FLOGI response\n"); + + if (lport->state != LPORT_ST_FLOGI) { + FC_DBG("Received a FLOGI response, but in state %s\n", + fc_lport_state(lport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_lport_error(lport, fp); + goto err; + } + + fh = fc_frame_header_get(fp); + did = ntoh24(fh->fh_d_id); + if (fc_frame_payload_op(fp) == ELS_LS_ACC && did != 0) { + + FC_DEBUG_LPORT("Assigned fid %x\n", did); + fc_host_port_id(lport->host) = did; + + flp = fc_frame_payload_get(fp, sizeof(*flp)); + if (flp) { + mfs = ntohs(flp->fl_csp.sp_bb_data) & + FC_SP_BB_DATA_MASK; + if (mfs >= FC_SP_MIN_MAX_PAYLOAD && + mfs < lport->mfs) + lport->mfs = mfs; + csp_flags = ntohs(flp->fl_csp.sp_features); + r_a_tov = ntohl(flp->fl_csp.sp_r_a_tov); + e_d_tov = ntohl(flp->fl_csp.sp_e_d_tov); + if (csp_flags & FC_SP_FT_EDTR) + e_d_tov /= 1000000; + if ((csp_flags & FC_SP_FT_FPORT) == 0) { + if (e_d_tov > lport->e_d_tov) + lport->e_d_tov = e_d_tov; + lport->r_a_tov = 2 * e_d_tov; + FC_DBG("Point-to-Point mode\n"); + fc_lport_ptp_setup(lport, ntoh24(fh->fh_s_id), + get_unaligned_be64( + &flp->fl_wwpn), + get_unaligned_be64( + &flp->fl_wwnn)); + } else { + lport->e_d_tov = e_d_tov; + lport->r_a_tov = r_a_tov; + fc_host_fabric_name(lport->host) = + get_unaligned_be64(&flp->fl_wwnn); + fc_lport_enter_dns(lport); + } + } + + if (flp) { + csp_flags = ntohs(flp->fl_csp.sp_features); + if ((csp_flags & FC_SP_FT_FPORT) == 0) { + lport->tt.disc_start(fc_lport_disc_callback, + lport); + } + } + } else { + FC_DBG("bad FLOGI response\n"); + } + +out: + fc_frame_free(fp); +err: + mutex_unlock(&lport->lp_mutex); +} + +/** + * 
fc_lport_enter_flogi - Send a FLOGI request to the fabric manager + * @lport: Fibre Channel local port to be logged in to the fabric + * + * Locking Note: The lport lock is expected to be held before calling + * this routine. + */ +void fc_lport_enter_flogi(struct fc_lport *lport) +{ + struct fc_frame *fp; + + FC_DEBUG_LPORT("Processing FLOGI state\n"); + + fc_lport_state_enter(lport, LPORT_ST_FLOGI); + + fp = fc_frame_alloc(lport, sizeof(struct fc_els_flogi)); + if (!fp) + return fc_lport_error(lport, fp); + + if (!lport->tt.elsct_send(lport, NULL, fp, ELS_FLOGI, + fc_lport_flogi_resp, lport, lport->e_d_tov)) + fc_lport_error(lport, fp); +} + +/* Configure a fc_lport */ +int fc_lport_config(struct fc_lport *lport) +{ + INIT_DELAYED_WORK(&lport->retry_work, fc_lport_timeout); + mutex_init(&lport->lp_mutex); + + fc_lport_state_enter(lport, LPORT_ST_NONE); + + fc_lport_add_fc4_type(lport, FC_TYPE_FCP); + fc_lport_add_fc4_type(lport, FC_TYPE_CT); + + return 0; +} +EXPORT_SYMBOL(fc_lport_config); + +int fc_lport_init(struct fc_lport *lport) +{ + if (!lport->tt.lport_recv) + lport->tt.lport_recv = fc_lport_recv_req; + + if (!lport->tt.lport_reset) + lport->tt.lport_reset = fc_lport_reset; + + fc_host_port_type(lport->host) = FC_PORTTYPE_NPORT; + fc_host_node_name(lport->host) = lport->wwnn; + fc_host_port_name(lport->host) = lport->wwpn; + fc_host_supported_classes(lport->host) = FC_COS_CLASS3; + memset(fc_host_supported_fc4s(lport->host), 0, + sizeof(fc_host_supported_fc4s(lport->host))); + fc_host_supported_fc4s(lport->host)[2] = 1; + fc_host_supported_fc4s(lport->host)[7] = 1; + + /* This value is also unchanging */ + memset(fc_host_active_fc4s(lport->host), 0, + sizeof(fc_host_active_fc4s(lport->host))); + fc_host_active_fc4s(lport->host)[2] = 1; + fc_host_active_fc4s(lport->host)[7] = 1; + fc_host_maxframe_size(lport->host) = lport->mfs; + fc_host_supported_speeds(lport->host) = 0; + if (lport->link_supported_speeds & FC_PORTSPEED_1GBIT) +
fc_host_supported_speeds(lport->host) |= FC_PORTSPEED_1GBIT; + if (lport->link_supported_speeds & FC_PORTSPEED_10GBIT) + fc_host_supported_speeds(lport->host) |= FC_PORTSPEED_10GBIT; + + return 0; +} +EXPORT_SYMBOL(fc_lport_init); diff --git a/drivers/scsi/libfc/fc_rport.c b/drivers/scsi/libfc/fc_rport.c new file mode 100644 index 00000000000..e780d8caf70 --- /dev/null +++ b/drivers/scsi/libfc/fc_rport.c @@ -0,0 +1,1291 @@ +/* + * Copyright(c) 2007 - 2008 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +/* + * RPORT GENERAL INFO + * + * This file contains all processing regarding fc_rports. It contains the + * rport state machine and does all rport interaction with the transport class. + * There should be no other places in libfc that interact directly with the + * transport class in regards to adding and deleting rports. + * + * fc_rport's represent N_Port's within the fabric. + */ + +/* + * RPORT LOCKING + * + * The rport should never hold the rport mutex and then attempt to acquire + * either the lport or disc mutexes. The rport's mutex is considered lesser + * than both the lport's mutex and the disc mutex. Refer to fc_lport.c for + * more comments on the hierarchy. + * + * The locking strategy is similar to the lport's strategy.
The lock protects + * the rport's states and is held and released by the entry points to the rport + * block. All _enter_* functions correspond to rport states and expect the rport + * mutex to be locked before calling them. This means that rports only handle + * one request or response at a time, since they're not critical for the I/O + * path this potential over-use of the mutex is acceptable. + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +static int fc_rport_debug; + +#define FC_DEBUG_RPORT(fmt...) \ + do { \ + if (fc_rport_debug) \ + FC_DBG(fmt); \ + } while (0) + +struct workqueue_struct *rport_event_queue; + +static void fc_rport_enter_plogi(struct fc_rport *); +static void fc_rport_enter_prli(struct fc_rport *); +static void fc_rport_enter_rtv(struct fc_rport *); +static void fc_rport_enter_ready(struct fc_rport *); +static void fc_rport_enter_logo(struct fc_rport *); + +static void fc_rport_recv_plogi_req(struct fc_rport *, + struct fc_seq *, struct fc_frame *); +static void fc_rport_recv_prli_req(struct fc_rport *, + struct fc_seq *, struct fc_frame *); +static void fc_rport_recv_prlo_req(struct fc_rport *, + struct fc_seq *, struct fc_frame *); +static void fc_rport_recv_logo_req(struct fc_rport *, + struct fc_seq *, struct fc_frame *); +static void fc_rport_timeout(struct work_struct *); +static void fc_rport_error(struct fc_rport *, struct fc_frame *); +static void fc_rport_work(struct work_struct *); + +static const char *fc_rport_state_names[] = { + [RPORT_ST_NONE] = "None", + [RPORT_ST_INIT] = "Init", + [RPORT_ST_PLOGI] = "PLOGI", + [RPORT_ST_PRLI] = "PRLI", + [RPORT_ST_RTV] = "RTV", + [RPORT_ST_READY] = "Ready", + [RPORT_ST_LOGO] = "LOGO", +}; + +static void fc_rport_rogue_destroy(struct device *dev) +{ + struct fc_rport *rport = dev_to_rport(dev); + FC_DEBUG_RPORT("Destroying rogue rport (%6x)\n", rport->port_id); + kfree(rport); +} + +struct fc_rport *fc_rport_rogue_create(struct fc_disc_port 
*dp) +{ + struct fc_rport *rport; + struct fc_rport_libfc_priv *rdata; + rport = kzalloc(sizeof(*rport) + sizeof(*rdata), GFP_KERNEL); + + if (!rport) + return NULL; + + rdata = RPORT_TO_PRIV(rport); + + rport->dd_data = rdata; + rport->port_id = dp->ids.port_id; + rport->port_name = dp->ids.port_name; + rport->node_name = dp->ids.node_name; + rport->roles = dp->ids.roles; + rport->maxframe_size = FC_MIN_MAX_PAYLOAD; + /* + * Note: all this libfc rogue rport code will be removed for + * upstream so it is fine that this is really ugly and hacky right now. + */ + device_initialize(&rport->dev); + rport->dev.release = fc_rport_rogue_destroy; + + mutex_init(&rdata->rp_mutex); + rdata->local_port = dp->lp; + rdata->trans_state = FC_PORTSTATE_ROGUE; + rdata->rp_state = RPORT_ST_INIT; + rdata->event = RPORT_EV_NONE; + rdata->flags = FC_RP_FLAGS_REC_SUPPORTED; + rdata->ops = NULL; + rdata->e_d_tov = dp->lp->e_d_tov; + rdata->r_a_tov = dp->lp->r_a_tov; + INIT_DELAYED_WORK(&rdata->retry_work, fc_rport_timeout); + INIT_WORK(&rdata->event_work, fc_rport_work); + /* + * For good measure, but not necessary as we should only + * add REAL rport to the lport list. + */ + INIT_LIST_HEAD(&rdata->peers); + + return rport; +} + +/** + * fc_rport_state - return a string for the state the rport is in + * @rport: The rport whose state we want to get a string for + */ +static const char *fc_rport_state(struct fc_rport *rport) +{ + const char *cp; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + + cp = fc_rport_state_names[rdata->rp_state]; + if (!cp) + cp = "Unknown"; + return cp; +} + +/** + * fc_set_rport_loss_tmo - Set the remote port loss timeout in seconds.
+ * @rport: Pointer to Fibre Channel remote port structure + * @timeout: timeout in seconds + */ +void fc_set_rport_loss_tmo(struct fc_rport *rport, u32 timeout) +{ + if (timeout) + rport->dev_loss_tmo = timeout + 5; + else + rport->dev_loss_tmo = 30; +} +EXPORT_SYMBOL(fc_set_rport_loss_tmo); + +/** + * fc_plogi_get_maxframe - Get max payload from the common service parameters + * @flp: FLOGI payload structure + * @maxval: upper limit, may be less than what is in the service parameters + */ +static unsigned int +fc_plogi_get_maxframe(struct fc_els_flogi *flp, unsigned int maxval) +{ + unsigned int mfs; + + /* + * Get max payload from the common service parameters and the + * class 3 receive data field size. + */ + mfs = ntohs(flp->fl_csp.sp_bb_data) & FC_SP_BB_DATA_MASK; + if (mfs >= FC_SP_MIN_MAX_PAYLOAD && mfs < maxval) + maxval = mfs; + mfs = ntohs(flp->fl_cssp[3 - 1].cp_rdfs); + if (mfs >= FC_SP_MIN_MAX_PAYLOAD && mfs < maxval) + maxval = mfs; + return maxval; +} + +/** + * fc_rport_state_enter - Change the rport's state + * @rport: The rport whose state should change + * @new: The new state of the rport + * + * Locking Note: Called with the rport lock held + */ +static void fc_rport_state_enter(struct fc_rport *rport, + enum fc_rport_state new) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + if (rdata->rp_state != new) + rdata->retries = 0; + rdata->rp_state = new; +} + +static void fc_rport_work(struct work_struct *work) +{ + struct fc_rport_libfc_priv *rdata = + container_of(work, struct fc_rport_libfc_priv, event_work); + enum fc_rport_event event; + enum fc_rport_trans_state trans_state; + struct fc_lport *lport = rdata->local_port; + struct fc_rport_operations *rport_ops; + struct fc_rport *rport = PRIV_TO_RPORT(rdata); + + mutex_lock(&rdata->rp_mutex); + event = rdata->event; + rport_ops = rdata->ops; + + if (event == RPORT_EV_CREATED) { + struct fc_rport *new_rport; + struct fc_rport_libfc_priv *new_rdata; + struct fc_rport_identifiers ids; + 
+ ids.port_id = rport->port_id; + ids.roles = rport->roles; + ids.port_name = rport->port_name; + ids.node_name = rport->node_name; + + mutex_unlock(&rdata->rp_mutex); + + new_rport = fc_remote_port_add(lport->host, 0, &ids); + if (new_rport) { + /* + * Switch from the rogue rport to the rport + * returned by the FC class. + */ + new_rport->maxframe_size = rport->maxframe_size; + + new_rdata = new_rport->dd_data; + new_rdata->e_d_tov = rdata->e_d_tov; + new_rdata->r_a_tov = rdata->r_a_tov; + new_rdata->ops = rdata->ops; + new_rdata->local_port = rdata->local_port; + new_rdata->flags = FC_RP_FLAGS_REC_SUPPORTED; + new_rdata->trans_state = FC_PORTSTATE_REAL; + mutex_init(&new_rdata->rp_mutex); + INIT_DELAYED_WORK(&new_rdata->retry_work, + fc_rport_timeout); + INIT_LIST_HEAD(&new_rdata->peers); + INIT_WORK(&new_rdata->event_work, fc_rport_work); + + fc_rport_state_enter(new_rport, RPORT_ST_READY); + } else { + FC_DBG("Failed to create the rport for port " + "(%6x).\n", ids.port_id); + event = RPORT_EV_FAILED; + } + put_device(&rport->dev); + rport = new_rport; + rdata = new_rport->dd_data; + if (rport_ops->event_callback) + rport_ops->event_callback(lport, rport, event); + } else if ((event == RPORT_EV_FAILED) || + (event == RPORT_EV_LOGO) || + (event == RPORT_EV_STOP)) { + trans_state = rdata->trans_state; + mutex_unlock(&rdata->rp_mutex); + if (rport_ops->event_callback) + rport_ops->event_callback(lport, rport, event); + if (trans_state == FC_PORTSTATE_ROGUE) + put_device(&rport->dev); + else + fc_remote_port_delete(rport); + } else + mutex_unlock(&rdata->rp_mutex); +} + +/** + * fc_rport_login - Start the remote port login state machine + * @rport: Fibre Channel remote port + * + * Locking Note: Called without the rport lock held. This + * function will hold the rport lock, call an _enter_* + * function and then unlock the rport. 
+ */ +int fc_rport_login(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + + mutex_lock(&rdata->rp_mutex); + + FC_DEBUG_RPORT("Login to port (%6x)\n", rport->port_id); + + fc_rport_enter_plogi(rport); + + mutex_unlock(&rdata->rp_mutex); + + return 0; +} + +/** + * fc_rport_logoff - Logoff and remove an rport + * @rport: Fibre Channel remote port to be removed + * + * Locking Note: Called without the rport lock held. This + * function will hold the rport lock, call an _enter_* + * function and then unlock the rport. + */ +int fc_rport_logoff(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + + mutex_lock(&rdata->rp_mutex); + + FC_DEBUG_RPORT("Remove port (%6x)\n", rport->port_id); + + fc_rport_enter_logo(rport); + + /* + * Change the state to NONE so that we discard + * the response. + */ + fc_rport_state_enter(rport, RPORT_ST_NONE); + + mutex_unlock(&rdata->rp_mutex); + + cancel_delayed_work_sync(&rdata->retry_work); + + mutex_lock(&rdata->rp_mutex); + + rdata->event = RPORT_EV_STOP; + queue_work(rport_event_queue, &rdata->event_work); + + mutex_unlock(&rdata->rp_mutex); + + return 0; +} + +/** + * fc_rport_enter_ready - The rport is ready + * @rport: Fibre Channel remote port that is ready + * + * Locking Note: The rport lock is expected to be held before calling + * this routine. + */ +static void fc_rport_enter_ready(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + + fc_rport_state_enter(rport, RPORT_ST_READY); + + FC_DEBUG_RPORT("Port (%6x) is Ready\n", rport->port_id); + + rdata->event = RPORT_EV_CREATED; + queue_work(rport_event_queue, &rdata->event_work); +} + +/** + * fc_rport_timeout - Handler for the retry_work timer. + * @work: The work struct of the fc_rport_libfc_priv + * + * Locking Note: Called without the rport lock held. This + * function will hold the rport lock, call an _enter_* + * function and then unlock the rport. 
+ */ +static void fc_rport_timeout(struct work_struct *work) +{ + struct fc_rport_libfc_priv *rdata = + container_of(work, struct fc_rport_libfc_priv, retry_work.work); + struct fc_rport *rport = PRIV_TO_RPORT(rdata); + + mutex_lock(&rdata->rp_mutex); + + switch (rdata->rp_state) { + case RPORT_ST_PLOGI: + fc_rport_enter_plogi(rport); + break; + case RPORT_ST_PRLI: + fc_rport_enter_prli(rport); + break; + case RPORT_ST_RTV: + fc_rport_enter_rtv(rport); + break; + case RPORT_ST_LOGO: + fc_rport_enter_logo(rport); + break; + case RPORT_ST_READY: + case RPORT_ST_INIT: + case RPORT_ST_NONE: + break; + } + + mutex_unlock(&rdata->rp_mutex); + put_device(&rport->dev); +} + +/** + * fc_rport_error - Handler for any errors + * @rport: The fc_rport object + * @fp: The frame pointer + * + * If the error was caused by a resource allocation failure + * then wait for half a second and retry, otherwise retry + * immediately. + * + * Locking Note: The rport lock is expected to be held before + * calling this routine + */ +static void fc_rport_error(struct fc_rport *rport, struct fc_frame *fp) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + unsigned long delay = 0; + + FC_DEBUG_RPORT("Error %ld in state %s, retries %d\n", + PTR_ERR(fp), fc_rport_state(rport), rdata->retries); + + if (!fp || PTR_ERR(fp) == -FC_EX_TIMEOUT) { + /* + * Memory allocation failure, or the exchange timed out. 
+ * Retry after delay + */ + if (rdata->retries < rdata->local_port->max_retry_count) { + rdata->retries++; + if (!fp) + delay = msecs_to_jiffies(500); + get_device(&rport->dev); + schedule_delayed_work(&rdata->retry_work, delay); + } else { + switch (rdata->rp_state) { + case RPORT_ST_PLOGI: + case RPORT_ST_PRLI: + case RPORT_ST_LOGO: + rdata->event = RPORT_EV_FAILED; + queue_work(rport_event_queue, + &rdata->event_work); + break; + case RPORT_ST_RTV: + fc_rport_enter_ready(rport); + break; + case RPORT_ST_NONE: + case RPORT_ST_READY: + case RPORT_ST_INIT: + break; + } + } + } +} + +/** + * fc_rport_plogi_resp - Handle incoming ELS PLOGI response + * @sp: current sequence in the PLOGI exchange + * @fp: response frame + * @rp_arg: Fibre Channel remote port + * + * Locking Note: This function will be called without the rport lock + * held, but it will lock, call an _enter_* function or fc_rport_error + * and then unlock the rport. + */ +static void fc_rport_plogi_resp(struct fc_seq *sp, struct fc_frame *fp, + void *rp_arg) +{ + struct fc_rport *rport = rp_arg; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + struct fc_els_flogi *plp; + unsigned int tov; + u16 csp_seq; + u16 cssp_seq; + u8 op; + + mutex_lock(&rdata->rp_mutex); + + FC_DEBUG_RPORT("Received a PLOGI response from port (%6x)\n", + rport->port_id); + + if (rdata->rp_state != RPORT_ST_PLOGI) { + FC_DBG("Received a PLOGI response, but in state %s\n", + fc_rport_state(rport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_rport_error(rport, fp); + goto err; + } + + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC && + (plp = fc_frame_payload_get(fp, sizeof(*plp))) != NULL) { + rport->port_name = get_unaligned_be64(&plp->fl_wwpn); + rport->node_name = get_unaligned_be64(&plp->fl_wwnn); + + tov = ntohl(plp->fl_csp.sp_e_d_tov); + if (ntohs(plp->fl_csp.sp_features) & FC_SP_FT_EDTR) + tov /= 1000; + if (tov > rdata->e_d_tov) + rdata->e_d_tov = tov; +
csp_seq = ntohs(plp->fl_csp.sp_tot_seq); + cssp_seq = ntohs(plp->fl_cssp[3 - 1].cp_con_seq); + if (cssp_seq < csp_seq) + csp_seq = cssp_seq; + rdata->max_seq = csp_seq; + rport->maxframe_size = + fc_plogi_get_maxframe(plp, lport->mfs); + + /* + * If the rport is one of the well known addresses + * we skip PRLI and RTV and go straight to READY. + */ + if (rport->port_id >= FC_FID_DOM_MGR) + fc_rport_enter_ready(rport); + else + fc_rport_enter_prli(rport); + } else + fc_rport_error(rport, fp); + +out: + fc_frame_free(fp); +err: + mutex_unlock(&rdata->rp_mutex); + put_device(&rport->dev); +} + +/** + * fc_rport_enter_plogi - Send Port Login (PLOGI) request to peer + * @rport: Fibre Channel remote port to send PLOGI to + * + * Locking Note: The rport lock is expected to be held before calling + * this routine. + */ +static void fc_rport_enter_plogi(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + struct fc_frame *fp; + + FC_DEBUG_RPORT("Port (%6x) entered PLOGI state from %s state\n", + rport->port_id, fc_rport_state(rport)); + + fc_rport_state_enter(rport, RPORT_ST_PLOGI); + + rport->maxframe_size = FC_MIN_MAX_PAYLOAD; + fp = fc_frame_alloc(lport, sizeof(struct fc_els_flogi)); + if (!fp) { + fc_rport_error(rport, fp); + return; + } + rdata->e_d_tov = lport->e_d_tov; + + if (!lport->tt.elsct_send(lport, rport, fp, ELS_PLOGI, + fc_rport_plogi_resp, rport, lport->e_d_tov)) + fc_rport_error(rport, fp); + else + get_device(&rport->dev); +} + +/** + * fc_rport_prli_resp - Process Login (PRLI) response handler + * @sp: current sequence in the PRLI exchange + * @fp: response frame + * @rp_arg: Fibre Channel remote port + * + * Locking Note: This function will be called without the rport lock + * held, but it will lock, call an _enter_* function or fc_rport_error + * and then unlock the rport. 
+ */ +static void fc_rport_prli_resp(struct fc_seq *sp, struct fc_frame *fp, + void *rp_arg) +{ + struct fc_rport *rport = rp_arg; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct { + struct fc_els_prli prli; + struct fc_els_spp spp; + } *pp; + u32 roles = FC_RPORT_ROLE_UNKNOWN; + u32 fcp_parm = 0; + u8 op; + + mutex_lock(&rdata->rp_mutex); + + FC_DEBUG_RPORT("Received a PRLI response from port (%6x)\n", + rport->port_id); + + if (rdata->rp_state != RPORT_ST_PRLI) { + FC_DBG("Received a PRLI response, but in state %s\n", + fc_rport_state(rport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_rport_error(rport, fp); + goto err; + } + + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC) { + pp = fc_frame_payload_get(fp, sizeof(*pp)); + if (pp && pp->prli.prli_spp_len >= sizeof(pp->spp)) { + fcp_parm = ntohl(pp->spp.spp_params); + if (fcp_parm & FCP_SPPF_RETRY) + rdata->flags |= FC_RP_FLAGS_RETRY; + } + + rport->supported_classes = FC_COS_CLASS3; + if (fcp_parm & FCP_SPPF_INIT_FCN) + roles |= FC_RPORT_ROLE_FCP_INITIATOR; + if (fcp_parm & FCP_SPPF_TARG_FCN) + roles |= FC_RPORT_ROLE_FCP_TARGET; + + rport->roles = roles; + fc_rport_enter_rtv(rport); + + } else { + FC_DBG("Bad ELS response\n"); + rdata->event = RPORT_EV_FAILED; + queue_work(rport_event_queue, &rdata->event_work); + } + +out: + fc_frame_free(fp); +err: + mutex_unlock(&rdata->rp_mutex); + put_device(&rport->dev); +} + +/** + * fc_rport_logo_resp - Logout (LOGO) response handler + * @sp: current sequence in the LOGO exchange + * @fp: response frame + * @rp_arg: Fibre Channel remote port + * + * Locking Note: This function will be called without the rport lock + * held, but it will lock, call an _enter_* function or fc_rport_error + * and then unlock the rport. 
+ */ +static void fc_rport_logo_resp(struct fc_seq *sp, struct fc_frame *fp, + void *rp_arg) +{ + struct fc_rport *rport = rp_arg; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + u8 op; + + mutex_lock(&rdata->rp_mutex); + + FC_DEBUG_RPORT("Received a LOGO response from port (%6x)\n", + rport->port_id); + + if (IS_ERR(fp)) { + fc_rport_error(rport, fp); + goto err; + } + + if (rdata->rp_state != RPORT_ST_LOGO) { + FC_DEBUG_RPORT("Received a LOGO response, but in state %s\n", + fc_rport_state(rport)); + goto out; + } + + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC) { + fc_rport_enter_rtv(rport); + } else { + FC_DBG("Bad ELS response\n"); + rdata->event = RPORT_EV_LOGO; + queue_work(rport_event_queue, &rdata->event_work); + } + +out: + fc_frame_free(fp); +err: + mutex_unlock(&rdata->rp_mutex); + put_device(&rport->dev); +} + +/** + * fc_rport_enter_prli - Send Process Login (PRLI) request to peer + * @rport: Fibre Channel remote port to send PRLI to + * + * Locking Note: The rport lock is expected to be held before calling + * this routine. + */ +static void fc_rport_enter_prli(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + struct { + struct fc_els_prli prli; + struct fc_els_spp spp; + } *pp; + struct fc_frame *fp; + + FC_DEBUG_RPORT("Port (%6x) entered PRLI state from %s state\n", + rport->port_id, fc_rport_state(rport)); + + fc_rport_state_enter(rport, RPORT_ST_PRLI); + + fp = fc_frame_alloc(lport, sizeof(*pp)); + if (!fp) { + fc_rport_error(rport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, rport, fp, ELS_PRLI, + fc_rport_prli_resp, rport, lport->e_d_tov)) + fc_rport_error(rport, fp); + else + get_device(&rport->dev); +} + +/** + * fc_rport_els_rtv_resp - Request Timeout Value response handler + * @sp: current sequence in the RTV exchange + * @fp: response frame + * @rp_arg: Fibre Channel remote port + * + * Many targets don't seem to support this. 
+ * + * Locking Note: This function will be called without the rport lock + * held, but it will lock, call an _enter_* function or fc_rport_error + * and then unlock the rport. + */ +static void fc_rport_rtv_resp(struct fc_seq *sp, struct fc_frame *fp, + void *rp_arg) +{ + struct fc_rport *rport = rp_arg; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + u8 op; + + mutex_lock(&rdata->rp_mutex); + + FC_DEBUG_RPORT("Received a RTV response from port (%6x)\n", + rport->port_id); + + if (rdata->rp_state != RPORT_ST_RTV) { + FC_DBG("Received a RTV response, but in state %s\n", + fc_rport_state(rport)); + goto out; + } + + if (IS_ERR(fp)) { + fc_rport_error(rport, fp); + goto err; + } + + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC) { + struct fc_els_rtv_acc *rtv; + u32 toq; + u32 tov; + + rtv = fc_frame_payload_get(fp, sizeof(*rtv)); + if (rtv) { + toq = ntohl(rtv->rtv_toq); + tov = ntohl(rtv->rtv_r_a_tov); + if (tov == 0) + tov = 1; + rdata->r_a_tov = tov; + tov = ntohl(rtv->rtv_e_d_tov); + if (toq & FC_ELS_RTV_EDRES) + tov /= 1000000; + if (tov == 0) + tov = 1; + rdata->e_d_tov = tov; + } + } + + fc_rport_enter_ready(rport); + +out: + fc_frame_free(fp); +err: + mutex_unlock(&rdata->rp_mutex); + put_device(&rport->dev); +} + +/** + * fc_rport_enter_rtv - Send Request Timeout Value (RTV) request to peer + * @rport: Fibre Channel remote port to send RTV to + * + * Locking Note: The rport lock is expected to be held before calling + * this routine. 
+ */ +static void fc_rport_enter_rtv(struct fc_rport *rport) +{ + struct fc_frame *fp; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + + FC_DEBUG_RPORT("Port (%6x) entered RTV state from %s state\n", + rport->port_id, fc_rport_state(rport)); + + fc_rport_state_enter(rport, RPORT_ST_RTV); + + fp = fc_frame_alloc(lport, sizeof(struct fc_els_rtv)); + if (!fp) { + fc_rport_error(rport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, rport, fp, ELS_RTV, + fc_rport_rtv_resp, rport, lport->e_d_tov)) + fc_rport_error(rport, fp); + else + get_device(&rport->dev); +} + +/** + * fc_rport_enter_logo - Send Logout (LOGO) request to peer + * @rport: Fibre Channel remote port to send LOGO to + * + * Locking Note: The rport lock is expected to be held before calling + * this routine. + */ +static void fc_rport_enter_logo(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + struct fc_frame *fp; + + FC_DEBUG_RPORT("Port (%6x) entered LOGO state from %s state\n", + rport->port_id, fc_rport_state(rport)); + + fc_rport_state_enter(rport, RPORT_ST_LOGO); + + fp = fc_frame_alloc(lport, sizeof(struct fc_els_logo)); + if (!fp) { + fc_rport_error(rport, fp); + return; + } + + if (!lport->tt.elsct_send(lport, rport, fp, ELS_LOGO, + fc_rport_logo_resp, rport, lport->e_d_tov)) + fc_rport_error(rport, fp); + else + get_device(&rport->dev); +} + + +/** + * fc_rport_recv_req - Receive a request from a rport + * @sp: current sequence in the PLOGI exchange + * @fp: response frame + * @rp_arg: Fibre Channel remote port + * + * Locking Note: Called without the rport lock held. This + * function will hold the rport lock, call an _enter_* + * function and then unlock the rport. 
+ */ +void fc_rport_recv_req(struct fc_seq *sp, struct fc_frame *fp, + struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + + struct fc_frame_header *fh; + struct fc_seq_els_data els_data; + u8 op; + + mutex_lock(&rdata->rp_mutex); + + els_data.fp = NULL; + els_data.explan = ELS_EXPL_NONE; + els_data.reason = ELS_RJT_NONE; + + fh = fc_frame_header_get(fp); + + if (fh->fh_r_ctl == FC_RCTL_ELS_REQ && fh->fh_type == FC_TYPE_ELS) { + op = fc_frame_payload_op(fp); + switch (op) { + case ELS_PLOGI: + fc_rport_recv_plogi_req(rport, sp, fp); + break; + case ELS_PRLI: + fc_rport_recv_prli_req(rport, sp, fp); + break; + case ELS_PRLO: + fc_rport_recv_prlo_req(rport, sp, fp); + break; + case ELS_LOGO: + fc_rport_recv_logo_req(rport, sp, fp); + break; + case ELS_RRQ: + els_data.fp = fp; + lport->tt.seq_els_rsp_send(sp, ELS_RRQ, &els_data); + break; + case ELS_REC: + els_data.fp = fp; + lport->tt.seq_els_rsp_send(sp, ELS_REC, &els_data); + break; + default: + els_data.reason = ELS_RJT_UNSUP; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &els_data); + break; + } + } + + mutex_unlock(&rdata->rp_mutex); +} + +/** + * fc_rport_recv_plogi_req - Handle incoming Port Login (PLOGI) request + * @rport: Fibre Channel remote port that initiated PLOGI + * @sp: current sequence in the PLOGI exchange + * @fp: PLOGI request frame + * + * Locking Note: The rport lock is expected to be held before calling + * this function. 
+ */ +static void fc_rport_recv_plogi_req(struct fc_rport *rport, + struct fc_seq *sp, struct fc_frame *rx_fp) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + struct fc_frame *fp = rx_fp; + struct fc_exch *ep; + struct fc_frame_header *fh; + struct fc_els_flogi *pl; + struct fc_seq_els_data rjt_data; + u32 sid; + u64 wwpn; + u64 wwnn; + enum fc_els_rjt_reason reject = 0; + u32 f_ctl; + rjt_data.fp = NULL; + + fh = fc_frame_header_get(fp); + + FC_DEBUG_RPORT("Received PLOGI request from port (%6x) " + "while in state %s\n", ntoh24(fh->fh_s_id), + fc_rport_state(rport)); + + sid = ntoh24(fh->fh_s_id); + pl = fc_frame_payload_get(fp, sizeof(*pl)); + if (!pl) { + FC_DBG("incoming PLOGI from %x too short\n", sid); + WARN_ON(1); + /* XXX TBD: send reject? */ + fc_frame_free(fp); + return; + } + wwpn = get_unaligned_be64(&pl->fl_wwpn); + wwnn = get_unaligned_be64(&pl->fl_wwnn); + + /* + * If the session was just created, possibly due to the incoming PLOGI, + * set the state appropriately and accept the PLOGI. + * + * If we had also sent a PLOGI, and if the received PLOGI is from a + * higher WWPN, we accept it, otherwise an LS_RJT is sent with reason + * "command already in progress". + * + * XXX TBD: If the session was ready before, the PLOGI should result in + * all outstanding exchanges being reset. 
+ */ + switch (rdata->rp_state) { + case RPORT_ST_INIT: + FC_DEBUG_RPORT("incoming PLOGI from %6x wwpn %llx state INIT " + "- reject\n", sid, wwpn); + reject = ELS_RJT_UNSUP; + break; + case RPORT_ST_PLOGI: + FC_DEBUG_RPORT("incoming PLOGI from %x in PLOGI state %d\n", + sid, rdata->rp_state); + if (wwpn < lport->wwpn) + reject = ELS_RJT_INPROG; + break; + case RPORT_ST_PRLI: + case RPORT_ST_READY: + FC_DEBUG_RPORT("incoming PLOGI from %x in logged-in state %d " + "- ignored for now\n", sid, rdata->rp_state); + /* XXX TBD - should reset */ + break; + case RPORT_ST_NONE: + default: + FC_DEBUG_RPORT("incoming PLOGI from %x in unexpected " + "state %d\n", sid, rdata->rp_state); + break; + } + + if (reject) { + rjt_data.reason = reject; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + fc_frame_free(fp); + } else { + fp = fc_frame_alloc(lport, sizeof(*pl)); + if (fp == NULL) { + fp = rx_fp; + rjt_data.reason = ELS_RJT_UNAB; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + fc_frame_free(fp); + } else { + sp = lport->tt.seq_start_next(sp); + WARN_ON(!sp); + fc_rport_set_name(rport, wwpn, wwnn); + + /* + * Get session payload size from incoming PLOGI. + */ + rport->maxframe_size = + fc_plogi_get_maxframe(pl, lport->mfs); + fc_frame_free(rx_fp); + fc_plogi_fill(lport, fp, ELS_LS_ACC); + + /* + * Send LS_ACC. If this fails, + * the originator should retry. 
+ */ + f_ctl = FC_FC_EX_CTX | FC_FC_LAST_SEQ; + f_ctl |= FC_FC_END_SEQ | FC_FC_SEQ_INIT; + ep = fc_seq_exch(sp); + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REP, ep->did, ep->sid, + FC_TYPE_ELS, f_ctl, 0); + lport->tt.seq_send(lport, sp, fp); + if (rdata->rp_state == RPORT_ST_PLOGI) + fc_rport_enter_prli(rport); + } + } +} + +/** + * fc_rport_recv_prli_req - Handle incoming Process Login (PRLI) request + * @rport: Fibre Channel remote port that initiated PRLI + * @sp: current sequence in the PRLI exchange + * @fp: PRLI request frame + * + * Locking Note: The rport lock is expected to be held before calling + * this function. + */ +static void fc_rport_recv_prli_req(struct fc_rport *rport, + struct fc_seq *sp, struct fc_frame *rx_fp) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + struct fc_exch *ep; + struct fc_frame *fp; + struct fc_frame_header *fh; + struct { + struct fc_els_prli prli; + struct fc_els_spp spp; + } *pp; + struct fc_els_spp *rspp; /* request service param page */ + struct fc_els_spp *spp; /* response spp */ + unsigned int len; + unsigned int plen; + enum fc_els_rjt_reason reason = ELS_RJT_UNAB; + enum fc_els_rjt_explan explan = ELS_EXPL_NONE; + enum fc_els_spp_resp resp; + struct fc_seq_els_data rjt_data; + u32 f_ctl; + u32 fcp_parm; + u32 roles = FC_RPORT_ROLE_UNKNOWN; + rjt_data.fp = NULL; + + fh = fc_frame_header_get(rx_fp); + + FC_DEBUG_RPORT("Received PRLI request from port (%6x) " + "while in state %s\n", ntoh24(fh->fh_s_id), + fc_rport_state(rport)); + + switch (rdata->rp_state) { + case RPORT_ST_PRLI: + case RPORT_ST_READY: + reason = ELS_RJT_NONE; + break; + default: + break; + } + len = fr_len(rx_fp) - sizeof(*fh); + pp = fc_frame_payload_get(rx_fp, sizeof(*pp)); + if (pp == NULL) { + reason = ELS_RJT_PROT; + explan = ELS_EXPL_INV_LEN; + } else { + plen = ntohs(pp->prli.prli_len); + if ((plen % 4) != 0 || plen > len) { + reason = ELS_RJT_PROT; + explan = ELS_EXPL_INV_LEN; + } else if (plen < 
len) { + len = plen; + } + plen = pp->prli.prli_spp_len; + if ((plen % 4) != 0 || plen < sizeof(*spp) || + plen > len || len < sizeof(*pp)) { + reason = ELS_RJT_PROT; + explan = ELS_EXPL_INV_LEN; + } + rspp = &pp->spp; + } + if (reason != ELS_RJT_NONE || + (fp = fc_frame_alloc(lport, len)) == NULL) { + rjt_data.reason = reason; + rjt_data.explan = explan; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + } else { + sp = lport->tt.seq_start_next(sp); + WARN_ON(!sp); + pp = fc_frame_payload_get(fp, len); + WARN_ON(!pp); + memset(pp, 0, len); + pp->prli.prli_cmd = ELS_LS_ACC; + pp->prli.prli_spp_len = plen; + pp->prli.prli_len = htons(len); + len -= sizeof(struct fc_els_prli); + + /* + * Go through all the service parameter pages and build + * response. If plen indicates longer SPP than standard, + * use that. The entire response has been pre-cleared above. + */ + spp = &pp->spp; + while (len >= plen) { + spp->spp_type = rspp->spp_type; + spp->spp_type_ext = rspp->spp_type_ext; + spp->spp_flags = rspp->spp_flags & FC_SPP_EST_IMG_PAIR; + resp = FC_SPP_RESP_ACK; + if (rspp->spp_flags & FC_SPP_RPA_VAL) + resp = FC_SPP_RESP_NO_PA; + switch (rspp->spp_type) { + case 0: /* common to all FC-4 types */ + break; + case FC_TYPE_FCP: + fcp_parm = ntohl(rspp->spp_params); + if (fcp_parm & FCP_SPPF_RETRY) + rdata->flags |= FC_RP_FLAGS_RETRY; + rport->supported_classes = FC_COS_CLASS3; + if (fcp_parm & FCP_SPPF_INIT_FCN) + roles |= FC_RPORT_ROLE_FCP_INITIATOR; + if (fcp_parm & FCP_SPPF_TARG_FCN) + roles |= FC_RPORT_ROLE_FCP_TARGET; + rport->roles = roles; + + spp->spp_params = + htonl(lport->service_params); + break; + default: + resp = FC_SPP_RESP_INVL; + break; + } + spp->spp_flags |= resp; + len -= plen; + rspp = (struct fc_els_spp *)((char *)rspp + plen); + spp = (struct fc_els_spp *)((char *)spp + plen); + } + + /* + * Send LS_ACC. If this fails, the originator should retry. 
+ */ + f_ctl = FC_FC_EX_CTX | FC_FC_LAST_SEQ; + f_ctl |= FC_FC_END_SEQ | FC_FC_SEQ_INIT; + ep = fc_seq_exch(sp); + fc_fill_fc_hdr(fp, FC_RCTL_ELS_REP, ep->did, ep->sid, + FC_TYPE_ELS, f_ctl, 0); + lport->tt.seq_send(lport, sp, fp); + + /* + * Get lock and re-check state. + */ + switch (rdata->rp_state) { + case RPORT_ST_PRLI: + fc_rport_enter_ready(rport); + break; + case RPORT_ST_READY: + break; + default: + break; + } + } + fc_frame_free(rx_fp); +} + +/** + * fc_rport_recv_prlo_req - Handle incoming Process Logout (PRLO) request + * @rport: Fibre Channel remote port that initiated PRLO + * @sp: current sequence in the PRLO exchange + * @fp: PRLO request frame + * + * Locking Note: The rport lock is expected to be held before calling + * this function. + */ +static void fc_rport_recv_prlo_req(struct fc_rport *rport, struct fc_seq *sp, + struct fc_frame *fp) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + + struct fc_frame_header *fh; + struct fc_seq_els_data rjt_data; + + fh = fc_frame_header_get(fp); + + FC_DEBUG_RPORT("Received PRLO request from port (%6x) " + "while in state %s\n", ntoh24(fh->fh_s_id), + fc_rport_state(rport)); + + rjt_data.fp = NULL; + rjt_data.reason = ELS_RJT_UNAB; + rjt_data.explan = ELS_EXPL_NONE; + lport->tt.seq_els_rsp_send(sp, ELS_LS_RJT, &rjt_data); + fc_frame_free(fp); +} + +/** + * fc_rport_recv_logo_req - Handle incoming Logout (LOGO) request + * @rport: Fibre Channel remote port that initiated LOGO + * @sp: current sequence in the LOGO exchange + * @fp: LOGO request frame + * + * Locking Note: The rport lock is expected to be held before calling + * this function. 
+ */ +static void fc_rport_recv_logo_req(struct fc_rport *rport, struct fc_seq *sp, + struct fc_frame *fp) +{ + struct fc_frame_header *fh; + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + + fh = fc_frame_header_get(fp); + + FC_DEBUG_RPORT("Received LOGO request from port (%6x) " + "while in state %s\n", ntoh24(fh->fh_s_id), + fc_rport_state(rport)); + + rdata->event = RPORT_EV_LOGO; + queue_work(rport_event_queue, &rdata->event_work); + + lport->tt.seq_els_rsp_send(sp, ELS_LS_ACC, NULL); + fc_frame_free(fp); +} + +static void fc_rport_flush_queue(void) +{ + flush_workqueue(rport_event_queue); +} + + +int fc_rport_init(struct fc_lport *lport) +{ + if (!lport->tt.rport_login) + lport->tt.rport_login = fc_rport_login; + + if (!lport->tt.rport_logoff) + lport->tt.rport_logoff = fc_rport_logoff; + + if (!lport->tt.rport_recv_req) + lport->tt.rport_recv_req = fc_rport_recv_req; + + if (!lport->tt.rport_flush_queue) + lport->tt.rport_flush_queue = fc_rport_flush_queue; + + return 0; +} +EXPORT_SYMBOL(fc_rport_init); + +int fc_setup_rport() +{ + rport_event_queue = create_singlethread_workqueue("fc_rport_eq"); + if (!rport_event_queue) + return -ENOMEM; + return 0; +} +EXPORT_SYMBOL(fc_setup_rport); + +void fc_destroy_rport() +{ + destroy_workqueue(rport_event_queue); +} +EXPORT_SYMBOL(fc_destroy_rport); + +void fc_rport_terminate_io(struct fc_rport *rport) +{ + struct fc_rport_libfc_priv *rdata = rport->dd_data; + struct fc_lport *lport = rdata->local_port; + + lport->tt.exch_mgr_reset(lport->emp, 0, rport->port_id); + lport->tt.exch_mgr_reset(lport->emp, rport->port_id, 0); +} +EXPORT_SYMBOL(fc_rport_terminate_io); diff --git a/include/scsi/fc_encode.h b/include/scsi/fc_encode.h new file mode 100644 index 00000000000..6300f556bce --- /dev/null +++ b/include/scsi/fc_encode.h @@ -0,0 +1,309 @@ +/* + * Copyright(c) 2008 Intel Corporation. All rights reserved. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +#ifndef _FC_ENCODE_H_ +#define _FC_ENCODE_H_ +#include + +struct fc_ns_rft { + struct fc_ns_fid fid; /* port ID object */ + struct fc_ns_fts fts; /* FC4-types object */ +}; + +struct fc_ct_req { + struct fc_ct_hdr hdr; + union { + struct fc_ns_gid_ft gid; + struct fc_ns_rn_id rn; + struct fc_ns_rft rft; + } payload; +}; + +/** + * fill FC header fields in specified fc_frame + */ +static inline void fc_fill_fc_hdr(struct fc_frame *fp, enum fc_rctl r_ctl, + u32 did, u32 sid, enum fc_fh_type type, + u32 f_ctl, u32 parm_offset) +{ + struct fc_frame_header *fh; + + fh = fc_frame_header_get(fp); + WARN_ON(r_ctl == 0); + fh->fh_r_ctl = r_ctl; + hton24(fh->fh_d_id, did); + hton24(fh->fh_s_id, sid); + fh->fh_type = type; + hton24(fh->fh_f_ctl, f_ctl); + fh->fh_cs_ctl = 0; + fh->fh_df_ctl = 0; + fh->fh_parm_offset = htonl(parm_offset); +} + +/** + * fc_ct_hdr_fill- fills ct header and reset ct payload + * returns pointer to ct request. 
+ */ +static inline struct fc_ct_req *fc_ct_hdr_fill(const struct fc_frame *fp, + unsigned int op, size_t req_size) +{ + struct fc_ct_req *ct; + size_t ct_plen; + + ct_plen = sizeof(struct fc_ct_hdr) + req_size; + ct = fc_frame_payload_get(fp, ct_plen); + memset(ct, 0, ct_plen); + ct->hdr.ct_rev = FC_CT_REV; + ct->hdr.ct_fs_type = FC_FST_DIR; + ct->hdr.ct_fs_subtype = FC_NS_SUBTYPE; + ct->hdr.ct_cmd = htons((u16) op); + return ct; +} + +/** + * fc_ct_fill - Fill in a name service request frame + */ +static inline int fc_ct_fill(struct fc_lport *lport, struct fc_frame *fp, + unsigned int op, enum fc_rctl *r_ctl, u32 *did, + enum fc_fh_type *fh_type) +{ + struct fc_ct_req *ct; + + switch (op) { + case FC_NS_GPN_FT: + ct = fc_ct_hdr_fill(fp, op, sizeof(struct fc_ns_gid_ft)); + ct->payload.gid.fn_fc4_type = FC_TYPE_FCP; + break; + + case FC_NS_RFT_ID: + ct = fc_ct_hdr_fill(fp, op, sizeof(struct fc_ns_rft)); + hton24(ct->payload.rft.fid.fp_fid, + fc_host_port_id(lport->host)); + ct->payload.rft.fts = lport->fcts; + break; + + case FC_NS_RPN_ID: + ct = fc_ct_hdr_fill(fp, op, sizeof(struct fc_ns_rn_id)); + hton24(ct->payload.rn.fr_fid.fp_fid, + fc_host_port_id(lport->host)); + ct->payload.rft.fts = lport->fcts; + put_unaligned_be64(lport->wwpn, &ct->payload.rn.fr_wwn); + break; + + default: + FC_DBG("Invalid op code %x \n", op); + return -EINVAL; + } + *r_ctl = FC_RCTL_DD_UNSOL_CTL; + *did = FC_FID_DIR_SERV; + *fh_type = FC_TYPE_CT; + return 0; +} + +/** + * fc_plogi_fill - Fill in plogi request frame + */ +static inline void fc_plogi_fill(struct fc_lport *lport, struct fc_frame *fp, + unsigned int op) +{ + struct fc_els_flogi *plogi; + struct fc_els_csp *csp; + struct fc_els_cssp *cp; + + plogi = fc_frame_payload_get(fp, sizeof(*plogi)); + memset(plogi, 0, sizeof(*plogi)); + plogi->fl_cmd = (u8) op; + put_unaligned_be64(lport->wwpn, &plogi->fl_wwpn); + put_unaligned_be64(lport->wwnn, &plogi->fl_wwnn); + + csp = &plogi->fl_csp; + csp->sp_hi_ver = 0x20; + csp->sp_lo_ver = 
0x20; + csp->sp_bb_cred = htons(10); /* this gets set by gateway */ + csp->sp_bb_data = htons((u16) lport->mfs); + cp = &plogi->fl_cssp[3 - 1]; /* class 3 parameters */ + cp->cp_class = htons(FC_CPC_VALID | FC_CPC_SEQ); + csp->sp_features = htons(FC_SP_FT_CIRO); + csp->sp_tot_seq = htons(255); /* seq. we accept */ + csp->sp_rel_off = htons(0x1f); + csp->sp_e_d_tov = htonl(lport->e_d_tov); + + cp->cp_rdfs = htons((u16) lport->mfs); + cp->cp_con_seq = htons(255); + cp->cp_open_seq = 1; +} + +/** + * fc_flogi_fill - Fill in a flogi request frame. + */ +static inline void fc_flogi_fill(struct fc_lport *lport, struct fc_frame *fp) +{ + struct fc_els_csp *sp; + struct fc_els_cssp *cp; + struct fc_els_flogi *flogi; + + flogi = fc_frame_payload_get(fp, sizeof(*flogi)); + memset(flogi, 0, sizeof(*flogi)); + flogi->fl_cmd = (u8) ELS_FLOGI; + put_unaligned_be64(lport->wwpn, &flogi->fl_wwpn); + put_unaligned_be64(lport->wwnn, &flogi->fl_wwnn); + sp = &flogi->fl_csp; + sp->sp_hi_ver = 0x20; + sp->sp_lo_ver = 0x20; + sp->sp_bb_cred = htons(10); /* this gets set by gateway */ + sp->sp_bb_data = htons((u16) lport->mfs); + cp = &flogi->fl_cssp[3 - 1]; /* class 3 parameters */ + cp->cp_class = htons(FC_CPC_VALID | FC_CPC_SEQ); +} + +/** + * fc_logo_fill - Fill in a logo request frame. + */ +static inline void fc_logo_fill(struct fc_lport *lport, struct fc_frame *fp) +{ + struct fc_els_logo *logo; + + logo = fc_frame_payload_get(fp, sizeof(*logo)); + memset(logo, 0, sizeof(*logo)); + logo->fl_cmd = ELS_LOGO; + hton24(logo->fl_n_port_id, fc_host_port_id(lport->host)); + logo->fl_n_port_wwn = htonll(lport->wwpn); +} + +/** + * fc_rtv_fill - Fill in RTV (read timeout value) request frame. 
+ */ +static inline void fc_rtv_fill(struct fc_lport *lport, struct fc_frame *fp) +{ + struct fc_els_rtv *rtv; + + rtv = fc_frame_payload_get(fp, sizeof(*rtv)); + memset(rtv, 0, sizeof(*rtv)); + rtv->rtv_cmd = ELS_RTV; +} + +/** + * fc_rec_fill - Fill in rec request frame + */ +static inline void fc_rec_fill(struct fc_lport *lport, struct fc_frame *fp) +{ + struct fc_els_rec *rec; + struct fc_exch *ep = fc_seq_exch(fr_seq(fp)); + + rec = fc_frame_payload_get(fp, sizeof(*rec)); + memset(rec, 0, sizeof(*rec)); + rec->rec_cmd = ELS_REC; + hton24(rec->rec_s_id, fc_host_port_id(lport->host)); + rec->rec_ox_id = htons(ep->oxid); + rec->rec_rx_id = htons(ep->rxid); +} + +/** + * fc_prli_fill - Fill in prli request frame + */ +static inline void fc_prli_fill(struct fc_lport *lport, struct fc_frame *fp) +{ + struct { + struct fc_els_prli prli; + struct fc_els_spp spp; + } *pp; + + pp = fc_frame_payload_get(fp, sizeof(*pp)); + memset(pp, 0, sizeof(*pp)); + pp->prli.prli_cmd = ELS_PRLI; + pp->prli.prli_spp_len = sizeof(struct fc_els_spp); + pp->prli.prli_len = htons(sizeof(*pp)); + pp->spp.spp_type = FC_TYPE_FCP; + pp->spp.spp_flags = FC_SPP_EST_IMG_PAIR; + pp->spp.spp_params = htonl(lport->service_params); +} + +/** + * fc_scr_fill - Fill in a scr request frame. 
+ */ +static inline void fc_scr_fill(struct fc_lport *lport, struct fc_frame *fp) +{ + struct fc_els_scr *scr; + + scr = fc_frame_payload_get(fp, sizeof(*scr)); + memset(scr, 0, sizeof(*scr)); + scr->scr_cmd = ELS_SCR; + scr->scr_reg_func = ELS_SCRF_FULL; +} + +/** + * fc_els_fill - Fill in an ELS request frame + */ +static inline int fc_els_fill(struct fc_lport *lport, struct fc_rport *rport, + struct fc_frame *fp, unsigned int op, + enum fc_rctl *r_ctl, u32 *did, enum fc_fh_type *fh_type) +{ + switch (op) { + case ELS_PLOGI: + fc_plogi_fill(lport, fp, ELS_PLOGI); + *did = rport->port_id; + break; + + case ELS_FLOGI: + fc_flogi_fill(lport, fp); + *did = FC_FID_FLOGI; + break; + + case ELS_LOGO: + fc_logo_fill(lport, fp); + *did = FC_FID_FLOGI; + /* + * if rport is valid then it + * is port logo, therefore + * set did to rport id. + */ + if (rport) + *did = rport->port_id; + break; + + case ELS_RTV: + fc_rtv_fill(lport, fp); + *did = rport->port_id; + break; + + case ELS_REC: + fc_rec_fill(lport, fp); + *did = rport->port_id; + break; + + case ELS_PRLI: + fc_prli_fill(lport, fp); + *did = rport->port_id; + break; + + case ELS_SCR: + fc_scr_fill(lport, fp); + *did = FC_FID_FCTRL; + break; + + default: + FC_DBG("Invalid op code %x \n", op); + return -EINVAL; + } + + *r_ctl = FC_RCTL_ELS_REQ; + *fh_type = FC_TYPE_ELS; + return 0; +} +#endif /* _FC_ENCODE_H_ */ diff --git a/include/scsi/fc_frame.h b/include/scsi/fc_frame.h new file mode 100644 index 00000000000..04d34a71355 --- /dev/null +++ b/include/scsi/fc_frame.h @@ -0,0 +1,242 @@ +/* + * Copyright(c) 2007 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. 
+ * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +#ifndef _FC_FRAME_H_ +#define _FC_FRAME_H_ + +#include +#include +#include + +#include +#include +#include + +/* + * The fc_frame interface is used to pass frame data between functions. + * The frame includes the data buffer, length, and SOF / EOF delimiter types. + * A pointer to the port structure of the receiving port is also included. + */ + +#define FC_FRAME_HEADROOM 32 /* headroom for VLAN + FCoE headers */ +#define FC_FRAME_TAILROOM 8 /* trailer space for FCoE */ + +/* + * Information about an individual fibre channel frame received or to be sent. + * The buffer may be in up to 4 additional non-contiguous sections, + * but the linear section must hold the frame header. 
+ */ +#define FC_FRAME_SG_LEN 4 /* scatter/gather list maximum length */ + +#define fp_skb(fp) (&((fp)->skb)) +#define fr_hdr(fp) ((fp)->skb.data) +#define fr_len(fp) ((fp)->skb.len) +#define fr_cb(fp) ((struct fcoe_rcv_info *)&((fp)->skb.cb[0])) +#define fr_dev(fp) (fr_cb(fp)->fr_dev) +#define fr_seq(fp) (fr_cb(fp)->fr_seq) +#define fr_sof(fp) (fr_cb(fp)->fr_sof) +#define fr_eof(fp) (fr_cb(fp)->fr_eof) +#define fr_flags(fp) (fr_cb(fp)->fr_flags) +#define fr_max_payload(fp) (fr_cb(fp)->fr_max_payload) +#define fr_cmd(fp) (fr_cb(fp)->fr_cmd) +#define fr_dir(fp) (fr_cmd(fp)->sc_data_direction) +#define fr_crc(fp) (fr_cb(fp)->fr_crc) + +struct fc_frame { + struct sk_buff skb; +}; + +struct fcoe_rcv_info { + struct packet_type *ptype; + struct fc_lport *fr_dev; /* transport layer private pointer */ + struct fc_seq *fr_seq; /* for use with exchange manager */ + struct scsi_cmnd *fr_cmd; /* for use of scsi command */ + u32 fr_crc; + u16 fr_max_payload; /* max FC payload */ + enum fc_sof fr_sof; /* start of frame delimiter */ + enum fc_eof fr_eof; /* end of frame delimiter */ + u8 fr_flags; /* flags - see below */ +}; + + +/* + * Get fc_frame pointer for an skb that's already been imported. + */ +static inline struct fcoe_rcv_info *fcoe_dev_from_skb(const struct sk_buff *skb) +{ + BUILD_BUG_ON(sizeof(struct fcoe_rcv_info) > sizeof(skb->cb)); + return (struct fcoe_rcv_info *) skb->cb; +} + +/* + * fr_flags. + */ +#define FCPHF_CRC_UNCHECKED 0x01 /* CRC not computed, still appended */ + +/* + * Initialize a frame. + * We don't do a complete memset here for performance reasons. + * The caller must set fr_free, fr_hdr, fr_len, fr_sof, and fr_eof eventually. + */ +static inline void fc_frame_init(struct fc_frame *fp) +{ + fr_dev(fp) = NULL; + fr_seq(fp) = NULL; + fr_flags(fp) = 0; +} + +struct fc_frame *fc_frame_alloc_fill(struct fc_lport *, size_t payload_len); + +struct fc_frame *__fc_frame_alloc(size_t payload_len); + +/* + * Get frame for sending via port. 
+ */ +static inline struct fc_frame *_fc_frame_alloc(struct fc_lport *dev, + size_t payload_len) +{ + return __fc_frame_alloc(payload_len); +} + +/* + * Allocate fc_frame structure and buffer. Set the initial length to + * payload_size + sizeof (struct fc_frame_header). + */ +static inline struct fc_frame *fc_frame_alloc(struct fc_lport *dev, size_t len) +{ + struct fc_frame *fp; + + /* + * Note: Since len will often be a constant multiple of 4, + * this check will usually be evaluated and eliminated at compile time. + */ + if ((len % 4) != 0) + fp = fc_frame_alloc_fill(dev, len); + else + fp = _fc_frame_alloc(dev, len); + return fp; +} + +/* + * Free the fc_frame structure and buffer. + */ +static inline void fc_frame_free(struct fc_frame *fp) +{ + kfree_skb(fp_skb(fp)); +} + +static inline int fc_frame_is_linear(struct fc_frame *fp) +{ + return !skb_is_nonlinear(fp_skb(fp)); +} + +/* + * Get frame header from message in fc_frame structure. + * This hides a cast and provides a place to add some checking. + */ +static inline +struct fc_frame_header *fc_frame_header_get(const struct fc_frame *fp) +{ + WARN_ON(fr_len(fp) < sizeof(struct fc_frame_header)); + return (struct fc_frame_header *) fr_hdr(fp); +} + +/* + * Get frame payload from message in fc_frame structure. + * This hides a cast and provides a place to add some checking. + * The len parameter is the minimum length for the payload portion. + * Returns NULL if the frame is too short. + * + * This assumes the interesting part of the payload is in the first part + * of the buffer for received data. This may not be appropriate to use for + * buffers being transmitted. + */ +static inline void *fc_frame_payload_get(const struct fc_frame *fp, + size_t len) +{ + void *pp = NULL; + + if (fr_len(fp) >= sizeof(struct fc_frame_header) + len) + pp = fc_frame_header_get(fp) + 1; + return pp; +} + +/* + * Get frame payload opcode (first byte) from message in fc_frame structure. 
+ * This hides a cast and provides a place to add some checking. Return 0 + * if the frame has no payload. + */ +static inline u8 fc_frame_payload_op(const struct fc_frame *fp) +{ + u8 *cp; + + cp = fc_frame_payload_get(fp, sizeof(u8)); + if (!cp) + return 0; + return *cp; + +} + +/* + * Get FC class from frame. + */ +static inline enum fc_class fc_frame_class(const struct fc_frame *fp) +{ + return fc_sof_class(fr_sof(fp)); +} + +/* + * Check the CRC in a frame. + * The CRC immediately follows the last data item *AFTER* the length. + * The return value is zero if the CRC matches. + */ +u32 fc_frame_crc_check(struct fc_frame *); + +static inline u8 fc_frame_rctl(const struct fc_frame *fp) +{ + return fc_frame_header_get(fp)->fh_r_ctl; +} + +static inline bool fc_frame_is_cmd(const struct fc_frame *fp) +{ + return fc_frame_rctl(fp) == FC_RCTL_DD_UNSOL_CMD; +} + +static inline bool fc_frame_is_read(const struct fc_frame *fp) +{ + if (fc_frame_is_cmd(fp) && fr_cmd(fp)) + return fr_dir(fp) == DMA_FROM_DEVICE; + return false; +} + +static inline bool fc_frame_is_write(const struct fc_frame *fp) +{ + if (fc_frame_is_cmd(fp) && fr_cmd(fp)) + return fr_dir(fp) == DMA_TO_DEVICE; + return false; +} + +/* + * Check for leaks. + * Print the frame header of any currently allocated frame, assuming there + * should be none at this point. + */ +void fc_frame_leak_check(void); + +#endif /* _FC_FRAME_H_ */ diff --git a/include/scsi/libfc.h b/include/scsi/libfc.h new file mode 100644 index 00000000000..9f2876397dd --- /dev/null +++ b/include/scsi/libfc.h @@ -0,0 +1,938 @@ +/* + * Copyright(c) 2007 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. 
+ * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +#ifndef _LIBFC_H_ +#define _LIBFC_H_ + +#include +#include + +#include +#include + +#include +#include +#include +#include + +#include + +#define LIBFC_DEBUG + +#ifdef LIBFC_DEBUG +/* Log messages */ +#define FC_DBG(fmt, args...) \ + do { \ + printk(KERN_INFO "%s " fmt, __func__, ##args); \ + } while (0) +#else +#define FC_DBG(fmt, args...) +#endif + +/* + * libfc error codes + */ +#define FC_NO_ERR 0 /* no error */ +#define FC_EX_TIMEOUT 1 /* Exchange timeout */ +#define FC_EX_CLOSED 2 /* Exchange closed */ + +/* some helpful macros */ + +#define ntohll(x) be64_to_cpu(x) +#define htonll(x) cpu_to_be64(x) + +#define ntoh24(p) (((p)[0] << 16) | ((p)[1] << 8) | ((p)[2])) + +#define hton24(p, v) do { \ + p[0] = (((v) >> 16) & 0xFF); \ + p[1] = (((v) >> 8) & 0xFF); \ + p[2] = ((v) & 0xFF); \ + } while (0) + +/* + * FC HBA status + */ +#define FC_PAUSE (1 << 1) +#define FC_LINK_UP (1 << 0) + +enum fc_lport_state { + LPORT_ST_NONE = 0, + LPORT_ST_FLOGI, + LPORT_ST_DNS, + LPORT_ST_RPN_ID, + LPORT_ST_RFT_ID, + LPORT_ST_SCR, + LPORT_ST_READY, + LPORT_ST_LOGO, + LPORT_ST_RESET +}; + +enum fc_disc_event { + DISC_EV_NONE = 0, + DISC_EV_SUCCESS, + DISC_EV_FAILED +}; + +enum fc_rport_state { + RPORT_ST_NONE = 0, + RPORT_ST_INIT, /* initialized */ + RPORT_ST_PLOGI, /* waiting for PLOGI completion */ + RPORT_ST_PRLI, /* waiting for PRLI completion */ + RPORT_ST_RTV, /* waiting for RTV completion */ + RPORT_ST_READY, /* ready for use */ + RPORT_ST_LOGO, /* port logout sent */ 
+}; + +enum fc_rport_trans_state { + FC_PORTSTATE_ROGUE, + FC_PORTSTATE_REAL, +}; + +/** + * struct fc_disc_port - temporary discovery port to hold rport identifiers + * @lp: Fibre Channel host port instance + * @peers: node for list management during discovery and RSCN processing + * @ids: identifiers structure to pass to fc_remote_port_add() + * @rport_work: work struct for starting the rport state machine + */ +struct fc_disc_port { + struct fc_lport *lp; + struct list_head peers; + struct fc_rport_identifiers ids; + struct work_struct rport_work; +}; + +enum fc_rport_event { + RPORT_EV_NONE = 0, + RPORT_EV_CREATED, + RPORT_EV_FAILED, + RPORT_EV_STOP, + RPORT_EV_LOGO +}; + +struct fc_rport_operations { + void (*event_callback)(struct fc_lport *, struct fc_rport *, + enum fc_rport_event); +}; + +/** + * struct fc_rport_libfc_priv - libfc internal information about a remote port + * @local_port: Fibre Channel host port instance + * @rp_state: state tracks progress of PLOGI, PRLI, and RTV exchanges + * @flags: REC and RETRY supported flags + * @max_seq: maximum number of concurrent sequences + * @retries: retry count in current state + * @e_d_tov: error detect timeout value (in msec) + * @r_a_tov: resource allocation timeout value (in msec) + * @rp_mutex: mutex protects rport + * @retry_work: + * @event_callback: Callback for rport READY, FAILED or LOGO + */ +struct fc_rport_libfc_priv { + struct fc_lport *local_port; + enum fc_rport_state rp_state; + u16 flags; + #define FC_RP_FLAGS_REC_SUPPORTED (1 << 0) + #define FC_RP_FLAGS_RETRY (1 << 1) + u16 max_seq; + unsigned int retries; + unsigned int e_d_tov; + unsigned int r_a_tov; + enum fc_rport_trans_state trans_state; + struct mutex rp_mutex; + struct delayed_work retry_work; + enum fc_rport_event event; + struct fc_rport_operations *ops; + struct list_head peers; + struct work_struct event_work; +}; + +#define PRIV_TO_RPORT(x) \ + (struct fc_rport *)((void *)x - sizeof(struct fc_rport)); +#define RPORT_TO_PRIV(x) 
\ + (struct fc_rport_libfc_priv *)((void *)x + sizeof(struct fc_rport)); + +struct fc_rport *fc_rport_rogue_create(struct fc_disc_port *); + +static inline void fc_rport_set_name(struct fc_rport *rport, u64 wwpn, u64 wwnn) +{ + rport->node_name = wwnn; + rport->port_name = wwpn; +} + +/* + * fcoe stats structure + */ +struct fcoe_dev_stats { + u64 SecondsSinceLastReset; + u64 TxFrames; + u64 TxWords; + u64 RxFrames; + u64 RxWords; + u64 ErrorFrames; + u64 DumpedFrames; + u64 LinkFailureCount; + u64 LossOfSignalCount; + u64 InvalidTxWordCount; + u64 InvalidCRCCount; + u64 InputRequests; + u64 OutputRequests; + u64 ControlRequests; + u64 InputMegabytes; + u64 OutputMegabytes; +}; + +/* + * els data is used for passing ELS response specific + * data to send ELS response mainly using information + * in exchange and sequence in EM layer. + */ +struct fc_seq_els_data { + struct fc_frame *fp; + enum fc_els_rjt_reason reason; + enum fc_els_rjt_explan explan; +}; + +/* + * FCP request structure, one for each scsi cmd request + */ +struct fc_fcp_pkt { + /* + * housekeeping stuff + */ + struct fc_lport *lp; /* handle to hba struct */ + u16 state; /* scsi_pkt state */ + u16 tgt_flags; /* target flags */ + atomic_t ref_cnt; /* fcp pkt ref count */ + spinlock_t scsi_pkt_lock; /* Must be taken before the host lock + * if both are held at the same time */ + /* + * SCSI I/O related stuff + */ + struct scsi_cmnd *cmd; /* scsi command pointer. set/clear + * under host lock */ + struct list_head list; /* tracks queued commands.
access under + * host lock */ + /* + * timeout related stuff + */ + struct timer_list timer; /* command timer */ + struct completion tm_done; + int wait_for_comp; + unsigned long start_time; /* start jiffie */ + unsigned long end_time; /* end jiffie */ + unsigned long last_pkt_time; /* jiffies of last frame received */ + + /* + * scsi cmd and data transfer information + */ + u32 data_len; + /* + * transport related variables + */ + struct fcp_cmnd cdb_cmd; + size_t xfer_len; + u32 xfer_contig_end; /* offset of end of contiguous xfer */ + u16 max_payload; /* max payload size in bytes */ + + /* + * scsi/fcp return status + */ + u32 io_status; /* SCSI result upper 24 bits */ + u8 cdb_status; + u8 status_code; /* FCP I/O status */ + /* bit 3: underrun, bit 2: overrun */ + u8 scsi_comp_flags; + u32 req_flags; /* bit 0: read, bit 1: write */ + u32 scsi_resid; /* residual length */ + + struct fc_rport *rport; /* remote port pointer */ + struct fc_seq *seq_ptr; /* current sequence pointer */ + /* + * Error Processing + */ + u8 recov_retry; /* count of recovery retries */ + struct fc_seq *recov_seq; /* sequence for REC or SRR */ +}; + +/* + * Structure and function definitions for managing Fibre Channel Exchanges + * and Sequences + * + * fc_exch holds state for one exchange and links to its active sequence. + * + * fc_seq holds the state for an individual sequence. + */ + +struct fc_exch_mgr; + +/* + * Sequence. + */ +struct fc_seq { + u8 id; /* seq ID */ + u16 ssb_stat; /* status flags for sequence status block */ + u16 cnt; /* frames sent so far on sequence */ + u32 rec_data; /* FC-4 value for REC */ +}; + +#define FC_EX_DONE (1 << 0) /* ep is completed */ +#define FC_EX_RST_CLEANUP (1 << 1) /* reset is forcing completion */ + +/* + * Exchange.
+ * + * Locking notes: The ex_lock protects following items: + * state, esb_stat, f_ctl, seq.ssb_stat + * seq_id + * sequence allocation + */ +struct fc_exch { + struct fc_exch_mgr *em; /* exchange manager */ + u32 state; /* internal driver state */ + u16 xid; /* our exchange ID */ + struct list_head ex_list; /* free or busy list linkage */ + spinlock_t ex_lock; /* lock covering exchange state */ + atomic_t ex_refcnt; /* reference counter */ + struct delayed_work timeout_work; /* timer for upper level protocols */ + struct fc_lport *lp; /* fc device instance */ + u16 oxid; /* originator's exchange ID */ + u16 rxid; /* responder's exchange ID */ + u32 oid; /* originator's FCID */ + u32 sid; /* source FCID */ + u32 did; /* destination FCID */ + u32 esb_stat; /* exchange status for ESB */ + u32 r_a_tov; /* r_a_tov from rport (msec) */ + u8 seq_id; /* next sequence ID to use */ + u32 f_ctl; /* F_CTL flags for sequences */ + u8 fh_type; /* frame type */ + enum fc_class class; /* class of service */ + struct fc_seq seq; /* single sequence */ + /* + * Handler for responses to this current exchange. + */ + void (*resp)(struct fc_seq *, struct fc_frame *, void *); + void (*destructor)(struct fc_seq *, void *); + /* + * arg is passed as void pointer to exchange + * resp and destructor handlers + */ + void *arg; +}; +#define fc_seq_exch(sp) container_of(sp, struct fc_exch, seq) + +struct libfc_function_template { + + /** + * Mandatory Fields + * + * These handlers must be implemented by the LLD. + */ + + /* + * Interface to send a FC frame + */ + int (*frame_send)(struct fc_lport *lp, struct fc_frame *fp); + + /** + * Optional Fields + * + * The LLD may choose to implement any of the following handlers. + * If the LLD doesn't specify a handler and leaves its pointer NULL then + * the default libfc function will be used for that handler.
+ */ + + /** + * ELS/CT interfaces + */ + + /* + * elsct_send - sends ELS/CT frame + */ + struct fc_seq *(*elsct_send)(struct fc_lport *lport, + struct fc_rport *rport, + struct fc_frame *fp, + unsigned int op, + void (*resp)(struct fc_seq *, + struct fc_frame *fp, + void *arg), + void *arg, u32 timer_msec); + /** + * Exchange Manager interfaces + */ + + /* + * Send the FC frame payload using a new exchange and sequence. + * + * The frame pointer with some of the header's fields must be + * filled before calling exch_seq_send(). Those fields are: + * + * - routing control + * - FC port did + * - FC port sid + * - FC header type + * - frame control + * - parameter or relative offset + * + * The exchange response handler is set in this routine to resp() + * function pointer. It can be called in two scenarios: if a timeout + * occurs or if a response frame is received for the exchange. The + * fc_frame pointer in response handler will also indicate timeout + * as error using IS_ERR related macros. + * + * The exchange destructor handler is also set in this routine. + * The destructor handler is invoked by EM layer when exchange + * is about to be freed; this can be used by caller to free its + * resources along with exchange free. + * + * The arg is passed back to resp and destructor handler. + * + * The timeout value (in msec) for an exchange is set if non zero + * timer_msec argument is specified. The timer is canceled when + * it fires or when the exchange is done. The exchange timeout handler + * is registered by EM layer. + */ + struct fc_seq *(*exch_seq_send)(struct fc_lport *lp, + struct fc_frame *fp, + void (*resp)(struct fc_seq *sp, + struct fc_frame *fp, + void *arg), + void (*destructor)(struct fc_seq *sp, + void *arg), + void *arg, unsigned int timer_msec); + + /* + * send a frame using existing sequence and exchange.
+ */ + int (*seq_send)(struct fc_lport *lp, struct fc_seq *sp, + struct fc_frame *fp); + + /* + * Send ELS response using mainly information + * in exchange and sequence in EM layer. + */ + void (*seq_els_rsp_send)(struct fc_seq *sp, enum fc_els_cmd els_cmd, + struct fc_seq_els_data *els_data); + + /* + * Abort an exchange and sequence. Generally called because of an + * exchange timeout or an abort from the upper layer. + * + * A timer_msec can be specified for abort timeout, if non-zero + * timer_msec value is specified then exchange resp handler + * will be called with timeout error if no response to abort. + */ + int (*seq_exch_abort)(const struct fc_seq *req_sp, + unsigned int timer_msec); + + /* + * Indicate that an exchange/sequence tuple is complete and the memory + * allocated for the related objects may be freed. + */ + void (*exch_done)(struct fc_seq *sp); + + /* + * Assigns an EM and a free XID for a new exchange and then + * allocates a new exchange and sequence pair. + * The fp can be used to determine free XID. + */ + struct fc_exch *(*exch_get)(struct fc_lport *lp, struct fc_frame *fp); + + /* + * Release previously assigned XID by exch_get API. + * The LLD may implement this if XID is assigned by LLD + * in exch_get(). + */ + void (*exch_put)(struct fc_lport *lp, struct fc_exch_mgr *mp, + u16 ex_id); + + /* + * Start a new sequence on the same exchange/sequence tuple. + */ + struct fc_seq *(*seq_start_next)(struct fc_seq *sp); + + /* + * Reset an exchange manager, completing all sequences and exchanges. + * If s_id is non-zero, reset only exchanges originating from that FID. + * If d_id is non-zero, reset only exchanges sending to that FID. + */ + void (*exch_mgr_reset)(struct fc_exch_mgr *, + u32 s_id, u32 d_id); + + void (*rport_flush_queue)(void); + /** + * Local Port interfaces + */ + + /* + * Receive a frame to a local port.
+ */ + void (*lport_recv)(struct fc_lport *lp, struct fc_seq *sp, + struct fc_frame *fp); + + int (*lport_reset)(struct fc_lport *); + + /** + * Remote Port interfaces + */ + + /* + * Initiates the RP state machine. It is called from the LP module. + * This function will issue the following commands to the N_Port + * identified by the FC ID provided. + * + * - PLOGI + * - PRLI + * - RTV + */ + int (*rport_login)(struct fc_rport *rport); + + /* + * Logoff, and remove the rport from the transport if + * it had been added. This will send a LOGO to the target. + */ + int (*rport_logoff)(struct fc_rport *rport); + + /* + * Receive a request from a remote port. + */ + void (*rport_recv_req)(struct fc_seq *, struct fc_frame *, + struct fc_rport *); + + struct fc_rport *(*rport_lookup)(const struct fc_lport *, u32); + + /** + * FCP interfaces + */ + + /* + * Send a fcp cmd from fsp pkt. + * Called with the SCSI host lock unlocked and irqs disabled. + * + * The resp handler is called when FCP_RSP is received. + * + */ + int (*fcp_cmd_send)(struct fc_lport *lp, struct fc_fcp_pkt *fsp, + void (*resp)(struct fc_seq *, struct fc_frame *fp, + void *arg)); + + /* + * Used at least during linkdown and reset + */ + void (*fcp_cleanup)(struct fc_lport *lp); + + /* + * Abort all I/O on a local port + */ + void (*fcp_abort_io)(struct fc_lport *lp); + + /** + * Discovery interfaces + */ + + void (*disc_recv_req)(struct fc_seq *, + struct fc_frame *, struct fc_lport *); + + /* + * Start discovery for a local port. + */ + void (*disc_start)(void (*disc_callback)(struct fc_lport *, + enum fc_disc_event), + struct fc_lport *); + + /* + * Stop discovery for a given lport. This will remove + * all discovered rports + */ + void (*disc_stop) (struct fc_lport *); + + /* + * Stop discovery for a given lport.
This will block + * until all discovered rports are deleted from the + * FC transport class + */ + void (*disc_stop_final) (struct fc_lport *); +}; + +/* information used by the discovery layer */ +struct fc_disc { + unsigned char retry_count; + unsigned char delay; + unsigned char pending; + unsigned char requested; + unsigned short seq_count; + unsigned char buf_len; + enum fc_disc_event event; + + void (*disc_callback)(struct fc_lport *, + enum fc_disc_event); + + struct list_head rports; + struct fc_lport *lport; + struct mutex disc_mutex; + struct fc_gpn_ft_resp partial_buf; /* partial name buffer */ + struct delayed_work disc_work; +}; + +struct fc_lport { + struct list_head list; + + /* Associations */ + struct Scsi_Host *host; + struct fc_exch_mgr *emp; + struct fc_rport *dns_rp; + struct fc_rport *ptp_rp; + void *scsi_priv; + struct fc_disc disc; + + /* Operational Information */ + struct libfc_function_template tt; + u16 link_status; + enum fc_lport_state state; + unsigned long boot_time; + + struct fc_host_statistics host_stats; + struct fcoe_dev_stats *dev_stats[NR_CPUS]; + u64 wwpn; + u64 wwnn; + u8 retry_count; + + /* Capabilities */ + u32 sg_supp:1; /* scatter gather supported */ + u32 seq_offload:1; /* seq offload supported */ + u32 crc_offload:1; /* crc offload supported */ + u32 lro_enabled:1; /* large receive offload */ + u32 mfs; /* max FC payload size */ + unsigned int service_params; + unsigned int e_d_tov; + unsigned int r_a_tov; + u8 max_retry_count; + u16 link_speed; + u16 link_supported_speeds; + u16 lro_xid; /* max xid for fcoe lro */ + struct fc_ns_fts fcts; /* FC-4 type masks */ + struct fc_els_rnid_gen rnid_gen; /* RNID information */ + + /* Semaphores */ + struct mutex lp_mutex; + + /* Miscellaneous */ + struct delayed_work retry_work; + struct delayed_work disc_work; +}; + +/** + * FC_LPORT HELPER FUNCTIONS + *****************************/ +static inline void *lport_priv(const struct fc_lport *lp) +{ + return (void *)(lp + 1); +} + 
+static inline int fc_lport_test_ready(struct fc_lport *lp) +{ + return lp->state == LPORT_ST_READY; +} + +static inline void fc_set_wwnn(struct fc_lport *lp, u64 wwnn) +{ + lp->wwnn = wwnn; +} + +static inline void fc_set_wwpn(struct fc_lport *lp, u64 wwpn) +{ + lp->wwpn = wwpn; +} + +static inline void fc_lport_state_enter(struct fc_lport *lp, + enum fc_lport_state state) +{ + if (state != lp->state) + lp->retry_count = 0; + lp->state = state; +} + + +/** + * LOCAL PORT LAYER + *****************************/ +int fc_lport_init(struct fc_lport *lp); + +/* + * Destroy the specified local port by finding and freeing all + * fc_rports associated with it and then by freeing the fc_lport + * itself. + */ +int fc_lport_destroy(struct fc_lport *lp); + +/* + * Log out the specified local port from the fabric + */ +int fc_fabric_logoff(struct fc_lport *lp); + +/* + * Initiate the LP state machine. This handler will use fc_host_attr + * to store the FLOGI service parameters, so fc_host_attr must be + * initialized before calling this handler. + */ +int fc_fabric_login(struct fc_lport *lp); + +/* + * The link is up for the given local port. + */ +void fc_linkup(struct fc_lport *); + +/* + * Link is down for the given local port. + */ +void fc_linkdown(struct fc_lport *); + +/* + * Pause and unpause traffic. + */ +void fc_pause(struct fc_lport *); +void fc_unpause(struct fc_lport *); + +/* + * Configure the local port. + */ +int fc_lport_config(struct fc_lport *); + +/* + * Reset the local port.
+ */ +int fc_lport_reset(struct fc_lport *); + +/* + * Set the mfs or reset + */ +int fc_set_mfs(struct fc_lport *lp, u32 mfs); + + +/** + * REMOTE PORT LAYER + *****************************/ +int fc_rport_init(struct fc_lport *lp); +void fc_rport_terminate_io(struct fc_rport *rp); + +/** + * DISCOVERY LAYER + *****************************/ +int fc_disc_init(struct fc_lport *lp); + + +/** + * SCSI LAYER + *****************************/ +/* + * Initialize the SCSI block of libfc + */ +int fc_fcp_init(struct fc_lport *); + +/* + * This section provides an API which allows direct interaction + * with the SCSI-ml. Each of these functions satisfies a function + * pointer defined in Scsi_Host and therefore is always called + * directly from the SCSI-ml. + */ +int fc_queuecommand(struct scsi_cmnd *sc_cmd, + void (*done)(struct scsi_cmnd *)); + +/* + * complete processing of a fcp packet + * + * This function may sleep if a fsp timer is pending. + * The host lock must not be held by caller. + */ +void fc_fcp_complete(struct fc_fcp_pkt *fsp); + +/* + * Send an ABTS frame to the target device. The sc_cmd argument + * is a pointer to the SCSI command to be aborted. + */ +int fc_eh_abort(struct scsi_cmnd *sc_cmd); + +/* + * Reset a LUN by sending the tm cmd to the target. + */ +int fc_eh_device_reset(struct scsi_cmnd *sc_cmd); + +/* + * Reset the host adapter. + */ +int fc_eh_host_reset(struct scsi_cmnd *sc_cmd); + +/* + * Check rport status. + */ +int fc_slave_alloc(struct scsi_device *sdev); + +/* + * Adjust the queue depth. + */ +int fc_change_queue_depth(struct scsi_device *sdev, int qdepth); + +/* + * Change the tag type. + */ +int fc_change_queue_type(struct scsi_device *sdev, int tag_type); + +/* + * Free memory pools used by the FCP layer.
+ */ +void fc_fcp_destroy(struct fc_lport *); + +/** + * ELS/CT interface + *****************************/ +/* + * Initializes ELS/CT interface + */ +int fc_elsct_init(struct fc_lport *lp); + + +/** + * EXCHANGE MANAGER LAYER + *****************************/ +/* + * Initializes Exchange Manager related + * function pointers in struct libfc_function_template. + */ +int fc_exch_init(struct fc_lport *lp); + +/* + * Allocates an Exchange Manager (EM). + * + * The EM manages exchanges for their allocation and + * free, also allows exchange lookup for received + * frame. + * + * The class is used for initializing FC class of + * allocated exchange from EM. + * + * The min_xid and max_xid will limit new + * exchange ID (XID) within this range for + * a new exchange. + * The LLD may choose to have multiple EMs, + * e.g. one EM instance per CPU receive thread in LLD. + * The LLD can use exch_get() of struct libfc_function_template + * to specify XID for a new exchange within + * a specified EM instance. + * + * The em_idx to uniquely identify an EM instance. + */ +struct fc_exch_mgr *fc_exch_mgr_alloc(struct fc_lport *lp, + enum fc_class class, + u16 min_xid, + u16 max_xid); + +/* + * Free an exchange manager. + */ +void fc_exch_mgr_free(struct fc_exch_mgr *mp); + +/* + * Receive a frame on specified local port and exchange manager. + */ +void fc_exch_recv(struct fc_lport *lp, struct fc_exch_mgr *mp, + struct fc_frame *fp); + +/* + * This function is for exch_seq_send function pointer in + * struct libfc_function_template, see comment block on + * exch_seq_send for description of this function. + */ +struct fc_seq *fc_exch_seq_send(struct fc_lport *lp, + struct fc_frame *fp, + void (*resp)(struct fc_seq *sp, + struct fc_frame *fp, + void *arg), + void (*destructor)(struct fc_seq *sp, + void *arg), + void *arg, u32 timer_msec); + +/* + * send a frame using existing sequence and exchange. 
+ */ +int fc_seq_send(struct fc_lport *lp, struct fc_seq *sp, struct fc_frame *fp); + +/* + * Send ELS response using mainly information + * in exchange and sequence in EM layer. + */ +void fc_seq_els_rsp_send(struct fc_seq *sp, enum fc_els_cmd els_cmd, + struct fc_seq_els_data *els_data); + +/* + * This function is for seq_exch_abort function pointer in + * struct libfc_function_template, see comment block on + * seq_exch_abort for description of this function. + */ +int fc_seq_exch_abort(const struct fc_seq *req_sp, unsigned int timer_msec); + +/* + * Indicate that an exchange/sequence tuple is complete and the memory + * allocated for the related objects may be freed. + */ +void fc_exch_done(struct fc_seq *sp); + +/* + * Assigns an EM and XID for a frame and then allocates + * a new exchange and sequence pair. + * The fp can be used to determine free XID. + */ +struct fc_exch *fc_exch_get(struct fc_lport *lp, struct fc_frame *fp); + +/* + * Allocate a new exchange and sequence pair. + * If ex_id is zero then the next free exchange id + * from the specified exchange manager mp will be assigned. + */ +struct fc_exch *fc_exch_alloc(struct fc_exch_mgr *mp, + struct fc_frame *fp, u16 ex_id); +/* + * Start a new sequence on the same exchange as the supplied sequence. + */ +struct fc_seq *fc_seq_start_next(struct fc_seq *sp); + +/* + * Reset an exchange manager, completing all sequences and exchanges. + * If s_id is non-zero, reset only exchanges originating from that FID. + * If d_id is non-zero, reset only exchanges sending to that FID. + */ +void fc_exch_mgr_reset(struct fc_exch_mgr *, u32 s_id, u32 d_id); + +/* + * Functions for fc_functions_template + */ +void fc_get_host_speed(struct Scsi_Host *shost); +void fc_get_host_port_type(struct Scsi_Host *shost); +void fc_get_host_port_state(struct Scsi_Host *shost); +void fc_set_rport_loss_tmo(struct fc_rport *rport, u32 timeout); +struct fc_host_statistics *fc_get_host_stats(struct Scsi_Host *); + +/* + * module setup functions.
+ */ +int fc_setup_exch_mgr(void); +void fc_destroy_exch_mgr(void); +int fc_setup_rport(void); +void fc_destroy_rport(void); + +#endif /* _LIBFC_H_ */ -- cgit v1.2.3-70-g09d2 From 85b4aa4926a50210b683ac89326e338e7d131211 Mon Sep 17 00:00:00 2001 From: Robert Love Date: Tue, 9 Dec 2008 15:10:24 -0800 Subject: [SCSI] fcoe: Fibre Channel over Ethernet Encapsulation protocol for running Fibre Channel over Ethernet interfaces. Creates virtual Fibre Channel host adapters using libfc. This layer is the LLD to the scsi-ml. It allocates the Scsi_Host, utilizes libfc for Fibre Channel protocol processing and interacts with netdev to send/receive Ethernet packets. Signed-off-by: Robert Love Signed-off-by: James Bottomley --- drivers/scsi/Kconfig | 7 + drivers/scsi/Makefile | 1 + drivers/scsi/fcoe/Makefile | 8 + drivers/scsi/fcoe/fc_transport_fcoe.c | 446 ++++++++++ drivers/scsi/fcoe/fcoe_sw.c | 494 +++++++++++ drivers/scsi/fcoe/libfcoe.c | 1510 +++++++++++++++++++++++++++++++++ include/scsi/fc_transport_fcoe.h | 54 ++ include/scsi/libfcoe.h | 176 ++++ 8 files changed, 2696 insertions(+) create mode 100644 drivers/scsi/fcoe/Makefile create mode 100644 drivers/scsi/fcoe/fc_transport_fcoe.c create mode 100644 drivers/scsi/fcoe/fcoe_sw.c create mode 100644 drivers/scsi/fcoe/libfcoe.c create mode 100644 include/scsi/fc_transport_fcoe.h create mode 100644 include/scsi/libfcoe.h (limited to 'drivers/scsi/Kconfig') diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 24d762aab7c..673463e4bbf 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -609,6 +609,13 @@ config LIBFC ---help--- Fibre Channel library module +config FCOE + tristate "FCoE module" + depends on SCSI + select LIBFC + ---help--- + Fibre Channel over Ethernet module + config SCSI_DMX3191D tristate "DMX3191D SCSI support" depends on PCI && SCSI diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index 87355f573d6..07d0f58de9b 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ 
-37,6 +37,7 @@ obj-$(CONFIG_SCSI_SRP_ATTRS) += scsi_transport_srp.o obj-$(CONFIG_SCSI_DH) += device_handler/ obj-$(CONFIG_LIBFC) += libfc/ +obj-$(CONFIG_FCOE) += fcoe/ obj-$(CONFIG_ISCSI_TCP) += libiscsi.o libiscsi_tcp.o iscsi_tcp.o obj-$(CONFIG_INFINIBAND_ISER) += libiscsi.o obj-$(CONFIG_SCSI_A4000T) += 53c700.o a4000t.o diff --git a/drivers/scsi/fcoe/Makefile b/drivers/scsi/fcoe/Makefile new file mode 100644 index 00000000000..b78da06d7c0 --- /dev/null +++ b/drivers/scsi/fcoe/Makefile @@ -0,0 +1,8 @@ +# $Id: Makefile + +obj-$(CONFIG_FCOE) += fcoe.o + +fcoe-y := \ + libfcoe.o \ + fcoe_sw.o \ + fc_transport_fcoe.o diff --git a/drivers/scsi/fcoe/fc_transport_fcoe.c b/drivers/scsi/fcoe/fc_transport_fcoe.c new file mode 100644 index 00000000000..bf7fe6fc082 --- /dev/null +++ b/drivers/scsi/fcoe/fc_transport_fcoe.c @@ -0,0 +1,446 @@ +/* + * Copyright(c) 2007 - 2008 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. 
+ * + * Maintained at www.Open-FCoE.org + */ + +#include +#include +#include + +/* internal fcoe transport */ +struct fcoe_transport_internal { + struct fcoe_transport *t; + struct net_device *netdev; + struct list_head list; +}; + +/* fcoe transports list and its lock */ +static LIST_HEAD(fcoe_transports); +static DEFINE_MUTEX(fcoe_transports_lock); + +/** + * fcoe_transport_default - returns ptr to the default transport fcoe_sw + **/ +struct fcoe_transport *fcoe_transport_default(void) +{ + return &fcoe_sw_transport; +} + +/** + * fcoe_transport_pcidev - get the pci dev from a netdev + * @netdev: the netdev that pci dev will be retrieved from + * + * Returns: NULL or the corresponding pci_dev + **/ +struct pci_dev *fcoe_transport_pcidev(const struct net_device *netdev) +{ + if (!netdev->dev.parent) + return NULL; + return to_pci_dev(netdev->dev.parent); +} + +/** + * fcoe_transport_device_lookup - find out if the netdev is managed by the + * transport + * @t: the fcoe transport whose device list is searched + * @netdev: the netdev to look for + * + * This walks the transport's device list under its devlock and returns the + * entry that refers to the given netdev. + * + * Returns: the matching entry, or NULL if the netdev is not managed + **/ +static struct fcoe_transport_internal *fcoe_transport_device_lookup( + struct fcoe_transport *t, struct net_device *netdev) +{ + struct fcoe_transport_internal *ti; + + /* find the entry for this device in the transport's list */ + mutex_lock(&t->devlock); + list_for_each_entry(ti, &t->devlist, list) { + if (ti->netdev == netdev) { + mutex_unlock(&t->devlock); + return ti; + } + } + mutex_unlock(&t->devlock); + return NULL; +} +/** + * fcoe_transport_device_add - assign a transport to a device + * @netdev: the netdev the transport to be attached to + * + * This will look for existing offload driver, if not found, it falls back to + * the default sw hba (fcoe_sw) as its fcoe transport.
+ * + * Returns: 0 for success + **/ +static int fcoe_transport_device_add(struct fcoe_transport *t, + struct net_device *netdev) +{ + struct fcoe_transport_internal *ti; + + ti = fcoe_transport_device_lookup(t, netdev); + if (ti) { + printk(KERN_DEBUG "fcoe_transport_device_add:" + "device %s is already added to transport %s\n", + netdev->name, t->name); + return -EEXIST; + } + /* allocate an internal struct to host the netdev and the list */ + ti = kzalloc(sizeof(*ti), GFP_KERNEL); + if (!ti) + return -ENOMEM; + + ti->t = t; + ti->netdev = netdev; + INIT_LIST_HEAD(&ti->list); + dev_hold(ti->netdev); + + mutex_lock(&t->devlock); + list_add(&ti->list, &t->devlist); + mutex_unlock(&t->devlock); + + printk(KERN_DEBUG "fcoe_transport_device_add:" + "device %s added to transport %s\n", + netdev->name, t->name); + + return 0; +} + +/** + * fcoe_transport_device_remove - remove a device from its transport + * @netdev: the netdev the transport to be attached to + * + * this removes the device from the transport so the given transport will + * not manage this device any more + * + * Returns: 0 for success + **/ +static int fcoe_transport_device_remove(struct fcoe_transport *t, + struct net_device *netdev) +{ + struct fcoe_transport_internal *ti; + + ti = fcoe_transport_device_lookup(t, netdev); + if (!ti) { + printk(KERN_DEBUG "fcoe_transport_device_remove:" + "device %s is not managed by transport %s\n", + netdev->name, t->name); + return -ENODEV; + } + mutex_lock(&t->devlock); + list_del(&ti->list); + mutex_unlock(&t->devlock); + printk(KERN_DEBUG "fcoe_transport_device_remove:" + "device %s removed from transport %s\n", + netdev->name, t->name); + dev_put(ti->netdev); + kfree(ti); + return 0; +} + +/** + * fcoe_transport_device_remove_all - remove all from transport devlist + * + * this removes the device from the transport so the given transport will + * not manage this device any more + * + * Returns: 0 for success + **/ +static void 
fcoe_transport_device_remove_all(struct fcoe_transport *t) +{ + struct fcoe_transport_internal *ti, *tmp; + + mutex_lock(&t->devlock); + list_for_each_entry_safe(ti, tmp, &t->devlist, list) { + list_del(&ti->list); + kfree(ti); + } + mutex_unlock(&t->devlock); +} + +/** + * fcoe_transport_match - use the bus device match function to match the hw + * @t: the fcoe transport + * @netdev: the netdev to be matched against the transport + * + * This function is used to check if the given transport wants to manage the + * input netdev. If the transport implements the match function, it will be + * called, otherwise we just compare the pci vendor and device id. + * + * Returns: true for match up + **/ +static bool fcoe_transport_match(struct fcoe_transport *t, + struct net_device *netdev) +{ + /* match transport by vendor and device id */ + struct pci_dev *pci; + + pci = fcoe_transport_pcidev(netdev); + + if (pci) { + printk(KERN_DEBUG "fcoe_transport_match:" + "%s:%x:%x -- %s:%x:%x\n", + t->name, t->vendor, t->device, + netdev->name, pci->vendor, pci->device); + + /* if transport supports match */ + if (t->match) + return t->match(netdev); + + /* else just compare the vendor and device id: pci only */ + return (t->vendor == pci->vendor) && (t->device == pci->device); + } + return false; +} + +/** + * fcoe_transport_lookup - find the registered transport for a netdev + * @netdev: the netdev whose transport is looked up + * + * This compares the parent device (pci) vendor and device id + * + * Returns: the matching transport, or the default sw transport if none matches + * + * TODO - return default sw transport if no other transport is found + **/ +static struct fcoe_transport *fcoe_transport_lookup( + struct net_device *netdev) +{ + struct fcoe_transport *t; + + mutex_lock(&fcoe_transports_lock); + list_for_each_entry(t, &fcoe_transports, list) { + if (fcoe_transport_match(t, netdev)) { + mutex_unlock(&fcoe_transports_lock); + return t; + } + } + mutex_unlock(&fcoe_transports_lock); + + printk(KERN_DEBUG "fcoe_transport_lookup:" + "use default transport for %s\n", netdev->name); + 
return fcoe_transport_default(); +} + +/** + * fcoe_transport_register - adds a fcoe transport to the fcoe transports list + * @t: ptr to the fcoe transport to be added + * + * Returns: 0 for success + **/ +int fcoe_transport_register(struct fcoe_transport *t) +{ + struct fcoe_transport *tt; + + /* TODO - add fcoe_transport specific initialization here */ + mutex_lock(&fcoe_transports_lock); + list_for_each_entry(tt, &fcoe_transports, list) { + if (tt == t) { + mutex_unlock(&fcoe_transports_lock); + return -EEXIST; + } + } + list_add_tail(&t->list, &fcoe_transports); + mutex_unlock(&fcoe_transports_lock); + + mutex_init(&t->devlock); + INIT_LIST_HEAD(&t->devlist); + + printk(KERN_DEBUG "fcoe_transport_register:%s\n", t->name); + + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_transport_register); + +/** + * fcoe_transport_unregister - remove the transport from the fcoe transports list + * @t: ptr to the fcoe transport to be removed + * + * Returns: 0 for success + **/ +int fcoe_transport_unregister(struct fcoe_transport *t) +{ + struct fcoe_transport *tt, *tmp; + + mutex_lock(&fcoe_transports_lock); + list_for_each_entry_safe(tt, tmp, &fcoe_transports, list) { + if (tt == t) { + list_del(&t->list); + mutex_unlock(&fcoe_transports_lock); + fcoe_transport_device_remove_all(t); + printk(KERN_DEBUG "fcoe_transport_unregister:%s\n", + t->name); + return 0; + } + } + mutex_unlock(&fcoe_transports_lock); + return -ENODEV; +} +EXPORT_SYMBOL_GPL(fcoe_transport_unregister); + +/* + * fcoe_load_transport_driver - load an offload driver by alias name + * @netdev: the target net device + * + * Requests an offload driver module as the fcoe transport; if that fails, it + * falls back to using the SW HBA (fcoe_sw) as its transport + * + * TODO - + * 1. supports only PCI device + * 2. needs fix for VLAN and bonding + * 3. 
pure hw fcoe hba may not have netdev + * + * Returns: 0 for success + **/ +int fcoe_load_transport_driver(struct net_device *netdev) +{ + struct pci_dev *pci; + struct device *dev = netdev->dev.parent; + + if (fcoe_transport_lookup(netdev)) { + /* load default transport */ + printk(KERN_DEBUG "fcoe: already loaded transport for %s\n", + netdev->name); + return -EEXIST; + } + + pci = to_pci_dev(dev); + if (dev->bus != &pci_bus_type) { + printk(KERN_DEBUG "fcoe: support only PCI device\n"); + return -ENODEV; + } + printk(KERN_DEBUG "fcoe: loading driver fcoe-pci-0x%04x-0x%04x\n", + pci->vendor, pci->device); + + return request_module("fcoe-pci-0x%04x-0x%04x", + pci->vendor, pci->device); + +} +EXPORT_SYMBOL_GPL(fcoe_load_transport_driver); + +/** + * fcoe_transport_attach - load transport to fcoe + * @netdev: the netdev the transport to be attached to + * + * This will look for existing offload driver, if not found, it falls back to + * the default sw hba (fcoe_sw) as its fcoe transport. + * + * Returns: 0 for success + **/ +int fcoe_transport_attach(struct net_device *netdev) +{ + struct fcoe_transport *t; + + /* find the corresponding transport */ + t = fcoe_transport_lookup(netdev); + if (!t) { + printk(KERN_DEBUG "fcoe_transport_attach" + ":no transport for %s\n", + netdev->name); + return -ENODEV; + } + /* add to the transport */ + if (fcoe_transport_device_add(t, netdev)) { + printk(KERN_DEBUG "fcoe_transport_attach" + ":failed to add %s to transport %s\n", + netdev->name, t->name); + return -EIO; + } + /* transport create function */ + if (t->create) + t->create(netdev); + + printk(KERN_DEBUG "fcoe_transport_attach:transport %s for %s\n", + t->name, netdev->name); + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_transport_attach); + +/** + * fcoe_transport_release - unload transport from fcoe + * @netdev: the net device on which fcoe is to be released + * + * Returns: 0 for success + **/ +int fcoe_transport_release(struct net_device *netdev) +{ + struct 
fcoe_transport *t; + + /* find the corresponding transport */ + t = fcoe_transport_lookup(netdev); + if (!t) { + printk(KERN_DEBUG "fcoe_transport_release:" + "no transport for %s\n", + netdev->name); + return -ENODEV; + } + /* remove the device from the transport */ + if (fcoe_transport_device_remove(t, netdev)) { + printk(KERN_DEBUG "fcoe_transport_release:" + "failed to remove %s from transport %s\n", + netdev->name, t->name); + return -EIO; + } + /* transport destroy function */ + if (t->destroy) + t->destroy(netdev); + + printk(KERN_DEBUG "fcoe_transport_release:" + "device %s detached from transport %s\n", + netdev->name, t->name); + + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_transport_release); + +/** + * fcoe_transport_init - initializes fcoe transport layer + * + * This prepares for the fcoe transport layer + * + * Returns: 0 always + **/ +int __init fcoe_transport_init(void) +{ + INIT_LIST_HEAD(&fcoe_transports); + mutex_init(&fcoe_transports_lock); + return 0; +} + +/** + * fcoe_transport_exit - cleans up the fcoe transport layer + * This cleans up the fcoe transport layer, removing any transport on the list. + * Note that the transport destroy func is not called here. + * + * Returns: 0 always + **/ +int __exit fcoe_transport_exit(void) +{ + struct fcoe_transport *t, *tmp; + + mutex_lock(&fcoe_transports_lock); + list_for_each_entry_safe(t, tmp, &fcoe_transports, list) { + list_del(&t->list); + mutex_unlock(&fcoe_transports_lock); + fcoe_transport_device_remove_all(t); + mutex_lock(&fcoe_transports_lock); + } + mutex_unlock(&fcoe_transports_lock); + return 0; +} diff --git a/drivers/scsi/fcoe/fcoe_sw.c b/drivers/scsi/fcoe/fcoe_sw.c new file mode 100644 index 00000000000..dc4cd5e2576 --- /dev/null +++ b/drivers/scsi/fcoe/fcoe_sw.c @@ -0,0 +1,494 @@ +/* + * Copyright(c) 2007 - 2008 Intel Corporation. All rights reserved. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + * + * Maintained at www.Open-FCoE.org + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include +#include +#include + +#define FCOE_SW_VERSION "0.1" +#define FCOE_SW_NAME "fcoesw" +#define FCOE_SW_VENDOR "Open-FCoE.org" + +#define FCOE_MAX_LUN 255 +#define FCOE_MAX_FCP_TARGET 256 + +#define FCOE_MAX_OUTSTANDING_COMMANDS 1024 + +#define FCOE_MIN_XID 0x0001 /* the min xid supported by fcoe_sw */ +#define FCOE_MAX_XID 0x07ef /* the max xid supported by fcoe_sw */ + +static struct scsi_transport_template *scsi_transport_fcoe_sw; + +struct fc_function_template fcoe_sw_transport_function = { + .show_host_node_name = 1, + .show_host_port_name = 1, + .show_host_supported_classes = 1, + .show_host_supported_fc4s = 1, + .show_host_active_fc4s = 1, + .show_host_maxframe_size = 1, + + .show_host_port_id = 1, + .show_host_supported_speeds = 1, + .get_host_speed = fc_get_host_speed, + .show_host_speed = 1, + .show_host_port_type = 1, + .get_host_port_state = fc_get_host_port_state, + .show_host_port_state = 1, + .show_host_symbolic_name = 1, + + .dd_fcrport_size = sizeof(struct fc_rport_libfc_priv), + .show_rport_maxframe_size = 1, + .show_rport_supported_classes = 1, + + .show_host_fabric_name = 1, + 
.show_starget_node_name = 1, + .show_starget_port_name = 1, + .show_starget_port_id = 1, + .set_rport_dev_loss_tmo = fc_set_rport_loss_tmo, + .show_rport_dev_loss_tmo = 1, + .get_fc_host_stats = fc_get_host_stats, + .issue_fc_host_lip = fcoe_reset, + + .terminate_rport_io = fc_rport_terminate_io, +}; + +static struct scsi_host_template fcoe_sw_shost_template = { + .module = THIS_MODULE, + .name = "FCoE Driver", + .proc_name = FCOE_SW_NAME, + .queuecommand = fc_queuecommand, + .eh_abort_handler = fc_eh_abort, + .eh_device_reset_handler = fc_eh_device_reset, + .eh_host_reset_handler = fc_eh_host_reset, + .slave_alloc = fc_slave_alloc, + .change_queue_depth = fc_change_queue_depth, + .change_queue_type = fc_change_queue_type, + .this_id = -1, + .cmd_per_lun = 32, + .can_queue = FCOE_MAX_OUTSTANDING_COMMANDS, + .use_clustering = ENABLE_CLUSTERING, + .sg_tablesize = SG_ALL, + .max_sectors = 0xffff, +}; + +/* + * fcoe_sw_lport_config - sets up the fc_lport + * @lp: ptr to the fc_lport + * @shost: ptr to the parent scsi host + * + * Returns: 0 for success + * + */ +static int fcoe_sw_lport_config(struct fc_lport *lp) +{ + int i = 0; + + lp->link_status = 0; + lp->max_retry_count = 3; + lp->e_d_tov = 2 * 1000; /* FC-FS default */ + lp->r_a_tov = 2 * 2 * 1000; + lp->service_params = (FCP_SPPF_INIT_FCN | FCP_SPPF_RD_XRDY_DIS | + FCP_SPPF_RETRY | FCP_SPPF_CONF_COMPL); + + /* + * allocate per cpu stats block + */ + for_each_online_cpu(i) + lp->dev_stats[i] = kzalloc(sizeof(struct fcoe_dev_stats), + GFP_KERNEL); + + /* lport fc_lport related configuration */ + fc_lport_config(lp); + + return 0; +} + +/* + * fcoe_sw_netdev_config - sets up fcoe_softc for lport and network + * related properties + * @lp : ptr to the fc_lport + * @netdev : ptr to the associated netdevice struct + * + * Must be called after fcoe_sw_lport_config() as it will use lport mutex + * + * Returns : 0 for success + * + */ +static int fcoe_sw_netdev_config(struct fc_lport *lp, struct net_device *netdev) +{ + 
u32 mfs; + u64 wwnn, wwpn; + struct fcoe_softc *fc; + u8 flogi_maddr[ETH_ALEN]; + + /* Setup lport private data to point to fcoe softc */ + fc = lport_priv(lp); + fc->lp = lp; + fc->real_dev = netdev; + fc->phys_dev = netdev; + + /* Require support for get_pauseparam ethtool op. */ + if (netdev->priv_flags & IFF_802_1Q_VLAN) + fc->phys_dev = vlan_dev_real_dev(netdev); + + /* Do not support for bonding device */ + if ((fc->real_dev->priv_flags & IFF_MASTER_ALB) || + (fc->real_dev->priv_flags & IFF_SLAVE_INACTIVE) || + (fc->real_dev->priv_flags & IFF_MASTER_8023AD)) { + return -EOPNOTSUPP; + } + + /* + * Determine max frame size based on underlying device and optional + * user-configured limit. If the MFS is too low, fcoe_link_ok() + * will return 0, so do this first. + */ + mfs = fc->real_dev->mtu - (sizeof(struct fcoe_hdr) + + sizeof(struct fcoe_crc_eof)); + if (fc_set_mfs(lp, mfs)) + return -EINVAL; + + lp->link_status = ~FC_PAUSE & ~FC_LINK_UP; + if (!fcoe_link_ok(lp)) + lp->link_status |= FC_LINK_UP; + + /* offload features support */ + if (fc->real_dev->features & NETIF_F_SG) + lp->sg_supp = 1; + + + skb_queue_head_init(&fc->fcoe_pending_queue); + + /* setup Source Mac Address */ + memcpy(fc->ctl_src_addr, fc->real_dev->dev_addr, + fc->real_dev->addr_len); + + wwnn = fcoe_wwn_from_mac(fc->real_dev->dev_addr, 1, 0); + fc_set_wwnn(lp, wwnn); + /* XXX - 3rd arg needs to be vlan id */ + wwpn = fcoe_wwn_from_mac(fc->real_dev->dev_addr, 2, 0); + fc_set_wwpn(lp, wwpn); + + /* + * Add FCoE MAC address as second unicast MAC address + * or enter promiscuous mode if not capable of listening + * for multiple unicast MACs. 
+ */ + rtnl_lock(); + memcpy(flogi_maddr, (u8[6]) FC_FCOE_FLOGI_MAC, ETH_ALEN); + dev_unicast_add(fc->real_dev, flogi_maddr, ETH_ALEN); + rtnl_unlock(); + + /* + * setup the receive function from ethernet driver + * on the ethertype for the given device + */ + fc->fcoe_packet_type.func = fcoe_rcv; + fc->fcoe_packet_type.type = __constant_htons(ETH_P_FCOE); + fc->fcoe_packet_type.dev = fc->real_dev; + dev_add_pack(&fc->fcoe_packet_type); + + return 0; +} + +/* + * fcoe_sw_shost_config - sets up fc_lport->host + * @lp : ptr to the fc_lport + * @shost : ptr to the associated scsi host + * @dev : device associated to scsi host + * + * Must be called after fcoe_sw_lport_config() and fcoe_sw_netdev_config() + * + * Returns : 0 for success + * + */ +static int fcoe_sw_shost_config(struct fc_lport *lp, struct Scsi_Host *shost, + struct device *dev) +{ + int rc = 0; + + /* lport scsi host config */ + lp->host = shost; + + lp->host->max_lun = FCOE_MAX_LUN; + lp->host->max_id = FCOE_MAX_FCP_TARGET; + lp->host->max_channel = 0; + lp->host->transportt = scsi_transport_fcoe_sw; + + /* add the new host to the SCSI-ml */ + rc = scsi_add_host(lp->host, dev); + if (rc) { + FC_DBG("fcoe_sw_shost_config:error on scsi_add_host\n"); + return rc; + } + sprintf(fc_host_symbolic_name(lp->host), "%s v%s over %s", + FCOE_SW_NAME, FCOE_SW_VERSION, + fcoe_netdev(lp)->name); + + return 0; +} + +/* + * fcoe_sw_em_config - allocates em for this lport + * @lp: the port that em is to be allocated for + * + * Returns : 0 on success + */ +static inline int fcoe_sw_em_config(struct fc_lport *lp) +{ + BUG_ON(lp->emp); + + lp->emp = fc_exch_mgr_alloc(lp, FC_CLASS_3, + FCOE_MIN_XID, FCOE_MAX_XID); + if (!lp->emp) + return -ENOMEM; + + return 0; +} + +/* + * fcoe_sw_destroy - FCoE software HBA tear-down function + * @netdev: ptr to the associated net_device + * + * Returns: 0 on success, -ENODEV if no lport is found for @netdev. 
+ */ +static int fcoe_sw_destroy(struct net_device *netdev) +{ + int cpu; + struct fc_lport *lp = NULL; + struct fcoe_softc *fc; + u8 flogi_maddr[ETH_ALEN]; + + BUG_ON(!netdev); + + printk(KERN_DEBUG "fcoe_sw_destroy:interface on %s\n", + netdev->name); + + lp = fcoe_hostlist_lookup(netdev); + if (!lp) + return -ENODEV; + + fc = fcoe_softc(lp); + + /* Logout of the fabric */ + fc_fabric_logoff(lp); + + /* Remove the instance from fcoe's list */ + fcoe_hostlist_remove(lp); + + /* Don't listen for Ethernet packets anymore */ + dev_remove_pack(&fc->fcoe_packet_type); + + /* Cleanup the fc_lport */ + fc_lport_destroy(lp); + fc_fcp_destroy(lp); + + /* Detach from the scsi-ml */ + fc_remove_host(lp->host); + scsi_remove_host(lp->host); + + /* There are no more rports or I/O, free the EM */ + if (lp->emp) + fc_exch_mgr_free(lp->emp); + + /* Delete secondary MAC addresses */ + rtnl_lock(); + memcpy(flogi_maddr, (u8[6]) FC_FCOE_FLOGI_MAC, ETH_ALEN); + dev_unicast_delete(fc->real_dev, flogi_maddr, ETH_ALEN); + if (compare_ether_addr(fc->data_src_addr, (u8[6]) { 0 })) + dev_unicast_delete(fc->real_dev, fc->data_src_addr, ETH_ALEN); + rtnl_unlock(); + + /* Free the per-CPU receive threads */ + fcoe_percpu_clean(lp); + + /* Free existing skbs */ + fcoe_clean_pending_queue(lp); + + /* Free memory used by statistical counters */ + for_each_online_cpu(cpu) + kfree(lp->dev_stats[cpu]); + + /* Release the net_device and Scsi_Host */ + dev_put(fc->real_dev); + scsi_host_put(lp->host); + + return 0; +} + +static struct libfc_function_template fcoe_sw_libfc_fcn_templ = { + .frame_send = fcoe_xmit, +}; + +/* + * fcoe_sw_create - this function creates the fcoe interface + * @netdev: pointer to the associated netdevice + * + * Creates fc_lport struct and scsi_host for lport, configures lport + * and starts fabric login. 
+ * + * Returns : 0 on success + */ +static int fcoe_sw_create(struct net_device *netdev) +{ + int rc; + struct fc_lport *lp = NULL; + struct fcoe_softc *fc; + struct Scsi_Host *shost; + + BUG_ON(!netdev); + + printk(KERN_DEBUG "fcoe_sw_create:interface on %s\n", + netdev->name); + + lp = fcoe_hostlist_lookup(netdev); + if (lp) + return -EEXIST; + + shost = fcoe_host_alloc(&fcoe_sw_shost_template, + sizeof(struct fcoe_softc)); + if (!shost) { + FC_DBG("Could not allocate host structure\n"); + return -ENOMEM; + } + lp = shost_priv(shost); + fc = lport_priv(lp); + + /* configure fc_lport, e.g., em */ + rc = fcoe_sw_lport_config(lp); + if (rc) { + FC_DBG("Could not configure lport\n"); + goto out_host_put; + } + + /* configure lport network properties */ + rc = fcoe_sw_netdev_config(lp, netdev); + if (rc) { + FC_DBG("Could not configure netdev for lport\n"); + goto out_host_put; + } + + /* configure lport scsi host properties */ + rc = fcoe_sw_shost_config(lp, shost, &netdev->dev); + if (rc) { + FC_DBG("Could not configure shost for lport\n"); + goto out_host_put; + } + + /* lport exch manager allocation */ + rc = fcoe_sw_em_config(lp); + if (rc) { + FC_DBG("Could not configure em for lport\n"); + goto out_host_put; + } + + /* Initialize the library */ + rc = fcoe_libfc_config(lp, &fcoe_sw_libfc_fcn_templ); + if (rc) { + FC_DBG("Could not configure libfc for lport!\n"); + goto out_lp_destroy; + } + + /* add to lports list */ + fcoe_hostlist_add(lp); + + lp->boot_time = jiffies; + + fc_fabric_login(lp); + + dev_hold(netdev); + + return rc; + +out_lp_destroy: + fc_exch_mgr_free(lp->emp); /* Free the EM */ +out_host_put: + scsi_host_put(lp->host); + return rc; +} + +/* + * fcoe_sw_match - the fcoe sw transport match function + * + * Returns : false always + */ +static bool fcoe_sw_match(struct net_device *netdev) +{ + /* FIXME - for sw transport, always return false */ + return false; +} + +/* the sw hba fcoe transport */ +struct fcoe_transport fcoe_sw_transport = { + 
.name = "fcoesw", + .create = fcoe_sw_create, + .destroy = fcoe_sw_destroy, + .match = fcoe_sw_match, + .vendor = 0x0, + .device = 0xffff, +}; + +/* + * fcoe_sw_init - registers fcoe_sw_transport + * + * Returns : 0 on success + */ +int __init fcoe_sw_init(void) +{ + /* attach to scsi transport */ + scsi_transport_fcoe_sw = + fc_attach_transport(&fcoe_sw_transport_function); + if (!scsi_transport_fcoe_sw) { + printk(KERN_ERR "fcoe_sw_init:fc_attach_transport() failed\n"); + return -ENODEV; + } + /* register sw transport */ + fcoe_transport_register(&fcoe_sw_transport); + return 0; +} + +/* + * fcoe_sw_exit - unregisters fcoe_sw_transport + * + * Returns : 0 on success + */ +int __exit fcoe_sw_exit(void) +{ + /* dettach the transport */ + fc_release_transport(scsi_transport_fcoe_sw); + fcoe_transport_unregister(&fcoe_sw_transport); + return 0; +} diff --git a/drivers/scsi/fcoe/libfcoe.c b/drivers/scsi/fcoe/libfcoe.c new file mode 100644 index 00000000000..1cb549c4fac --- /dev/null +++ b/drivers/scsi/fcoe/libfcoe.c @@ -0,0 +1,1510 @@ +/* + * Copyright(c) 2007 - 2008 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. 
+ * + * Maintained at www.Open-FCoE.org + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include +#include +#include +#include + +static int debug_fcoe; + +#define FCOE_MAX_QUEUE_DEPTH 256 + +/* destination address mode */ +#define FCOE_GW_ADDR_MODE 0x00 +#define FCOE_FCOUI_ADDR_MODE 0x01 + +#define FCOE_WORD_TO_BYTE 4 + +MODULE_AUTHOR("Open-FCoE.org"); +MODULE_DESCRIPTION("FCoE"); +MODULE_LICENSE("GPL"); + +/* fcoe host list */ +LIST_HEAD(fcoe_hostlist); +DEFINE_RWLOCK(fcoe_hostlist_lock); +DEFINE_TIMER(fcoe_timer, NULL, 0, 0); +struct fcoe_percpu_s *fcoe_percpu[NR_CPUS]; + + +/* Function Prototypes */ +static int fcoe_check_wait_queue(struct fc_lport *); +static void fcoe_insert_wait_queue_head(struct fc_lport *, struct sk_buff *); +static void fcoe_insert_wait_queue(struct fc_lport *, struct sk_buff *); +static void fcoe_recv_flogi(struct fcoe_softc *, struct fc_frame *, u8 *); +#ifdef CONFIG_HOTPLUG_CPU +static int fcoe_cpu_callback(struct notifier_block *, ulong, void *); +#endif /* CONFIG_HOTPLUG_CPU */ +static int fcoe_device_notification(struct notifier_block *, ulong, void *); +static void fcoe_dev_setup(void); +static void fcoe_dev_cleanup(void); + +/* notification function from net device */ +static struct notifier_block fcoe_notifier = { + .notifier_call = fcoe_device_notification, +}; + + +#ifdef CONFIG_HOTPLUG_CPU +static struct notifier_block fcoe_cpu_notifier = { + .notifier_call = fcoe_cpu_callback, +}; + +/** + * fcoe_create_percpu_data - creates the associated cpu data + * @cpu: index for the cpu where fcoe cpu data will be created + * + * create percpu stats block, from cpu add notifier + * + * Returns: none + **/ +static void fcoe_create_percpu_data(int cpu) +{ + struct fc_lport *lp; + struct fcoe_softc *fc; + + write_lock_bh(&fcoe_hostlist_lock); + 
list_for_each_entry(fc, &fcoe_hostlist, list) { + lp = fc->lp; + if (lp->dev_stats[cpu] == NULL) + lp->dev_stats[cpu] = + kzalloc(sizeof(struct fcoe_dev_stats), + GFP_KERNEL); + } + write_unlock_bh(&fcoe_hostlist_lock); +} + +/** + * fcoe_destroy_percpu_data - destroys the associated cpu data + * @cpu: index for the cpu where fcoe cpu data will be destroyed + * + * destroy percpu stats block called by cpu add/remove notifier + * + * Returns: none + **/ +static void fcoe_destroy_percpu_data(int cpu) +{ + struct fc_lport *lp; + struct fcoe_softc *fc; + + write_lock_bh(&fcoe_hostlist_lock); + list_for_each_entry(fc, &fcoe_hostlist, list) { + lp = fc->lp; + kfree(lp->dev_stats[cpu]); + lp->dev_stats[cpu] = NULL; + } + write_unlock_bh(&fcoe_hostlist_lock); +} + +/** + * fcoe_cpu_callback - fcoe cpu hotplug event callback + * @nfb: callback data block + * @action: event triggering the callback + * @hcpu: index for the cpu of this event + * + * this creates or destroys per cpu data for fcoe + * + * Returns NOTIFY_OK always. 
+ **/ +static int fcoe_cpu_callback(struct notifier_block *nfb, unsigned long action, + void *hcpu) +{ + unsigned int cpu = (unsigned long)hcpu; + + switch (action) { + case CPU_ONLINE: + fcoe_create_percpu_data(cpu); + break; + case CPU_DEAD: + fcoe_destroy_percpu_data(cpu); + break; + default: + break; + } + return NOTIFY_OK; +} +#endif /* CONFIG_HOTPLUG_CPU */ + +/** + * fcoe_rcv - this is the fcoe receive function called by NET_RX_SOFTIRQ + * @skb: the receive skb + * @dev: associated net device + * @ptype: context + * @olddev: last device + * + * this function will receive the packet and build fc frame and pass it up + * + * Returns: 0 for success + **/ +int fcoe_rcv(struct sk_buff *skb, struct net_device *dev, + struct packet_type *ptype, struct net_device *olddev) +{ + struct fc_lport *lp; + struct fcoe_rcv_info *fr; + struct fcoe_softc *fc; + struct fcoe_dev_stats *stats; + struct fc_frame_header *fh; + unsigned short oxid; + int cpu_idx; + struct fcoe_percpu_s *fps; + + fc = container_of(ptype, struct fcoe_softc, fcoe_packet_type); + lp = fc->lp; + if (unlikely(lp == NULL)) { + FC_DBG("cannot find hba structure"); + goto err2; + } + + if (unlikely(debug_fcoe)) { + FC_DBG("skb_info: len:%d data_len:%d head:%p data:%p tail:%p " + "end:%p sum:%d dev:%s", skb->len, skb->data_len, + skb->head, skb->data, skb_tail_pointer(skb), + skb_end_pointer(skb), skb->csum, + skb->dev ? skb->dev->name : ""); + + } + + /* check for FCOE packet type */ + if (unlikely(eth_hdr(skb)->h_proto != htons(ETH_P_FCOE))) { + FC_DBG("wrong FC type frame"); + goto err; + } + + /* + * Check for minimum frame length, and make sure required FCoE + * and FC headers are pulled into the linear data area. 
+ */ + if (unlikely((skb->len < FCOE_MIN_FRAME) || + !pskb_may_pull(skb, FCOE_HEADER_LEN))) + goto err; + + skb_set_transport_header(skb, sizeof(struct fcoe_hdr)); + fh = (struct fc_frame_header *) skb_transport_header(skb); + + oxid = ntohs(fh->fh_ox_id); + + fr = fcoe_dev_from_skb(skb); + fr->fr_dev = lp; + fr->ptype = ptype; + cpu_idx = 0; +#ifdef CONFIG_SMP + /* + * The incoming frame exchange id(oxid) is ANDed with num of online + * cpu bits to get cpu_idx and then this cpu_idx is used for selecting + * a per cpu kernel thread from fcoe_percpu. In case the cpu is + * offline or no kernel thread for derived cpu_idx then cpu_idx is + * initialize to first online cpu index. + */ + cpu_idx = oxid & (num_online_cpus() - 1); + if (!fcoe_percpu[cpu_idx] || !cpu_online(cpu_idx)) + cpu_idx = first_cpu(cpu_online_map); +#endif + fps = fcoe_percpu[cpu_idx]; + + spin_lock_bh(&fps->fcoe_rx_list.lock); + __skb_queue_tail(&fps->fcoe_rx_list, skb); + if (fps->fcoe_rx_list.qlen == 1) + wake_up_process(fps->thread); + + spin_unlock_bh(&fps->fcoe_rx_list.lock); + + return 0; +err: +#ifdef CONFIG_SMP + stats = lp->dev_stats[smp_processor_id()]; +#else + stats = lp->dev_stats[0]; +#endif + if (stats) + stats->ErrorFrames++; + +err2: + kfree_skb(skb); + return -1; +} +EXPORT_SYMBOL_GPL(fcoe_rcv); + +/** + * fcoe_start_io - pass to netdev to start xmit for fcoe + * @skb: the skb to be xmitted + * + * Returns: 0 for success + **/ +static inline int fcoe_start_io(struct sk_buff *skb) +{ + int rc; + + skb_get(skb); + rc = dev_queue_xmit(skb); + if (rc != 0) + return rc; + kfree_skb(skb); + return 0; +} + +/** + * fcoe_get_paged_crc_eof - in case we need alloc a page for crc_eof + * @skb: the skb to be xmitted + * @tlen: total len + * + * Returns: 0 for success + **/ +static int fcoe_get_paged_crc_eof(struct sk_buff *skb, int tlen) +{ + struct fcoe_percpu_s *fps; + struct page *page; + int cpu_idx; + + cpu_idx = get_cpu(); + fps = fcoe_percpu[cpu_idx]; + page = fps->crc_eof_page; + if 
(!page) { + page = alloc_page(GFP_ATOMIC); + if (!page) { + put_cpu(); + return -ENOMEM; + } + fps->crc_eof_page = page; + WARN_ON(fps->crc_eof_offset != 0); + } + + get_page(page); + skb_fill_page_desc(skb, skb_shinfo(skb)->nr_frags, page, + fps->crc_eof_offset, tlen); + skb->len += tlen; + skb->data_len += tlen; + skb->truesize += tlen; + fps->crc_eof_offset += sizeof(struct fcoe_crc_eof); + + if (fps->crc_eof_offset >= PAGE_SIZE) { + fps->crc_eof_page = NULL; + fps->crc_eof_offset = 0; + put_page(page); + } + put_cpu(); + return 0; +} + +/** + * fcoe_fc_crc - calculates FC CRC in this fcoe skb + * @fp: the fc_frame containing data to be checksummed + * + * This uses crc32() to calculate the crc for fc frame + * Return : 32 bit crc + * + **/ +u32 fcoe_fc_crc(struct fc_frame *fp) +{ + struct sk_buff *skb = fp_skb(fp); + struct skb_frag_struct *frag; + unsigned char *data; + unsigned long off, len, clen; + u32 crc; + unsigned i; + + crc = crc32(~0, skb->data, skb_headlen(skb)); + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + frag = &skb_shinfo(skb)->frags[i]; + off = frag->page_offset; + len = frag->size; + while (len > 0) { + clen = min(len, PAGE_SIZE - (off & ~PAGE_MASK)); + data = kmap_atomic(frag->page + (off >> PAGE_SHIFT), + KM_SKB_DATA_SOFTIRQ); + crc = crc32(crc, data + (off & ~PAGE_MASK), clen); + kunmap_atomic(data, KM_SKB_DATA_SOFTIRQ); + off += clen; + len -= clen; + } + } + return crc; +} +EXPORT_SYMBOL_GPL(fcoe_fc_crc); + +/** + * fcoe_xmit - FCoE frame transmit function + * @lp: the associated local port + * @fp: the fc_frame to be transmitted + * + * Return : 0 for success + * + **/ +int fcoe_xmit(struct fc_lport *lp, struct fc_frame *fp) +{ + int wlen, rc = 0; + u32 crc; + struct ethhdr *eh; + struct fcoe_crc_eof *cp; + struct sk_buff *skb; + struct fcoe_dev_stats *stats; + struct fc_frame_header *fh; + unsigned int hlen; /* header length implies the version */ + unsigned int tlen; /* trailer length */ + unsigned int elen; /* eth header, may 
include vlan */ + int flogi_in_progress = 0; + struct fcoe_softc *fc; + u8 sof, eof; + struct fcoe_hdr *hp; + + WARN_ON((fr_len(fp) % sizeof(u32)) != 0); + + fc = fcoe_softc(lp); + /* + * if it is a flogi then we need to learn gw-addr + * and my own fcid + */ + fh = fc_frame_header_get(fp); + if (unlikely(fh->fh_r_ctl == FC_RCTL_ELS_REQ)) { + if (fc_frame_payload_op(fp) == ELS_FLOGI) { + fc->flogi_oxid = ntohs(fh->fh_ox_id); + fc->address_mode = FCOE_FCOUI_ADDR_MODE; + fc->flogi_progress = 1; + flogi_in_progress = 1; + } else if (fc->flogi_progress && ntoh24(fh->fh_s_id) != 0) { + /* + * Here we must've gotten an SID by accepting an FLOGI + * from a point-to-point connection. Switch to using + * the source mac based on the SID. The destination + * MAC in this case would have been set by receiving the + * FLOGI. + */ + fc_fcoe_set_mac(fc->data_src_addr, fh->fh_s_id); + fc->flogi_progress = 0; + } + } + + skb = fp_skb(fp); + sof = fr_sof(fp); + eof = fr_eof(fp); + + elen = (fc->real_dev->priv_flags & IFF_802_1Q_VLAN) ? 
+ sizeof(struct vlan_ethhdr) : sizeof(struct ethhdr); + hlen = sizeof(struct fcoe_hdr); + tlen = sizeof(struct fcoe_crc_eof); + wlen = (skb->len - tlen + sizeof(crc)) / FCOE_WORD_TO_BYTE; + + /* crc offload */ + if (likely(lp->crc_offload)) { + skb->ip_summed = CHECKSUM_COMPLETE; + skb->csum_start = skb_headroom(skb); + skb->csum_offset = skb->len; + crc = 0; + } else { + skb->ip_summed = CHECKSUM_NONE; + crc = fcoe_fc_crc(fp); + } + + /* copy fc crc and eof to the skb buff */ + if (skb_is_nonlinear(skb)) { + skb_frag_t *frag; + if (fcoe_get_paged_crc_eof(skb, tlen)) { + kfree_skb(skb); + return -ENOMEM; + } + frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1]; + cp = kmap_atomic(frag->page, KM_SKB_DATA_SOFTIRQ) + + frag->page_offset; + } else { + cp = (struct fcoe_crc_eof *)skb_put(skb, tlen); + } + + memset(cp, 0, sizeof(*cp)); + cp->fcoe_eof = eof; + cp->fcoe_crc32 = cpu_to_le32(~crc); + + if (skb_is_nonlinear(skb)) { + kunmap_atomic(cp, KM_SKB_DATA_SOFTIRQ); + cp = NULL; + } + + /* adjust skb network/transport offsets to match mac/fcoe/fc */ + skb_push(skb, elen + hlen); + skb_reset_mac_header(skb); + skb_reset_network_header(skb); + skb->mac_len = elen; + skb->protocol = htons(ETH_P_802_3); + skb->dev = fc->real_dev; + + /* fill up mac and fcoe headers */ + eh = eth_hdr(skb); + eh->h_proto = htons(ETH_P_FCOE); + if (fc->address_mode == FCOE_FCOUI_ADDR_MODE) + fc_fcoe_set_mac(eh->h_dest, fh->fh_d_id); + else + /* insert GW address */ + memcpy(eh->h_dest, fc->dest_addr, ETH_ALEN); + + if (unlikely(flogi_in_progress)) + memcpy(eh->h_source, fc->ctl_src_addr, ETH_ALEN); + else + memcpy(eh->h_source, fc->data_src_addr, ETH_ALEN); + + hp = (struct fcoe_hdr *)(eh + 1); + memset(hp, 0, sizeof(*hp)); + if (FC_FCOE_VER) + FC_FCOE_ENCAPS_VER(hp, FC_FCOE_VER); + hp->fcoe_sof = sof; + + /* update tx stats: regardless if LLD fails */ + stats = lp->dev_stats[smp_processor_id()]; + if (stats) { + stats->TxFrames++; + stats->TxWords += wlen; + } + + /* send down to lld 
*/ + fr_dev(fp) = lp; + if (fc->fcoe_pending_queue.qlen) + rc = fcoe_check_wait_queue(lp); + + if (rc == 0) + rc = fcoe_start_io(skb); + + if (rc) { + fcoe_insert_wait_queue(lp, skb); + if (fc->fcoe_pending_queue.qlen > FCOE_MAX_QUEUE_DEPTH) + fc_pause(lp); + } + + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_xmit); + +/* + * fcoe_percpu_receive_thread - recv thread per cpu + * @arg: ptr to the fcoe per cpu struct + * + * Return: 0 for success + * + */ +int fcoe_percpu_receive_thread(void *arg) +{ + struct fcoe_percpu_s *p = arg; + u32 fr_len; + struct fc_lport *lp; + struct fcoe_rcv_info *fr; + struct fcoe_dev_stats *stats; + struct fc_frame_header *fh; + struct sk_buff *skb; + struct fcoe_crc_eof crc_eof; + struct fc_frame *fp; + u8 *mac = NULL; + struct fcoe_softc *fc; + struct fcoe_hdr *hp; + + set_user_nice(current, 19); + + while (!kthread_should_stop()) { + + spin_lock_bh(&p->fcoe_rx_list.lock); + while ((skb = __skb_dequeue(&p->fcoe_rx_list)) == NULL) { + set_current_state(TASK_INTERRUPTIBLE); + spin_unlock_bh(&p->fcoe_rx_list.lock); + schedule(); + set_current_state(TASK_RUNNING); + if (kthread_should_stop()) + return 0; + spin_lock_bh(&p->fcoe_rx_list.lock); + } + spin_unlock_bh(&p->fcoe_rx_list.lock); + fr = fcoe_dev_from_skb(skb); + lp = fr->fr_dev; + if (unlikely(lp == NULL)) { + FC_DBG("invalid HBA Structure"); + kfree_skb(skb); + continue; + } + + stats = lp->dev_stats[smp_processor_id()]; + + if (unlikely(debug_fcoe)) { + FC_DBG("skb_info: len:%d data_len:%d head:%p data:%p " + "tail:%p end:%p sum:%d dev:%s", + skb->len, skb->data_len, + skb->head, skb->data, skb_tail_pointer(skb), + skb_end_pointer(skb), skb->csum, + skb->dev ? skb->dev->name : ""); + } + + /* + * Save source MAC address before discarding header. 
+ */ + fc = lport_priv(lp); + if (unlikely(fc->flogi_progress)) + mac = eth_hdr(skb)->h_source; + + if (skb_is_nonlinear(skb)) + skb_linearize(skb); /* not ideal */ + + /* + * Frame length checks and setting up the header pointers + * was done in fcoe_rcv already. + */ + hp = (struct fcoe_hdr *) skb_network_header(skb); + fh = (struct fc_frame_header *) skb_transport_header(skb); + + if (unlikely(FC_FCOE_DECAPS_VER(hp) != FC_FCOE_VER)) { + if (stats) { + if (stats->ErrorFrames < 5) + FC_DBG("unknown FCoE version %x", + FC_FCOE_DECAPS_VER(hp)); + stats->ErrorFrames++; + } + kfree_skb(skb); + continue; + } + + skb_pull(skb, sizeof(struct fcoe_hdr)); + fr_len = skb->len - sizeof(struct fcoe_crc_eof); + + if (stats) { + stats->RxFrames++; + stats->RxWords += fr_len / FCOE_WORD_TO_BYTE; + } + + fp = (struct fc_frame *)skb; + fc_frame_init(fp); + fr_dev(fp) = lp; + fr_sof(fp) = hp->fcoe_sof; + + /* Copy out the CRC and EOF trailer for access */ + if (skb_copy_bits(skb, fr_len, &crc_eof, sizeof(crc_eof))) { + kfree_skb(skb); + continue; + } + fr_eof(fp) = crc_eof.fcoe_eof; + fr_crc(fp) = crc_eof.fcoe_crc32; + if (pskb_trim(skb, fr_len)) { + kfree_skb(skb); + continue; + } + + /* + * We only check CRC if no offload is available and if it is + * it's solicited data, in which case, the FCP layer would + * check it during the copy. 
+ */ + if (lp->crc_offload) + fr_flags(fp) &= ~FCPHF_CRC_UNCHECKED; + else + fr_flags(fp) |= FCPHF_CRC_UNCHECKED; + + fh = fc_frame_header_get(fp); + if (fh->fh_r_ctl == FC_RCTL_DD_SOL_DATA && + fh->fh_type == FC_TYPE_FCP) { + fc_exch_recv(lp, lp->emp, fp); + continue; + } + if (fr_flags(fp) & FCPHF_CRC_UNCHECKED) { + if (le32_to_cpu(fr_crc(fp)) != + ~crc32(~0, skb->data, fr_len)) { + if (debug_fcoe || stats->InvalidCRCCount < 5) + printk(KERN_WARNING "fcoe: dropping " + "frame with CRC error\n"); + stats->InvalidCRCCount++; + stats->ErrorFrames++; + fc_frame_free(fp); + continue; + } + fr_flags(fp) &= ~FCPHF_CRC_UNCHECKED; + } + /* non flogi and non data exchanges are handled here */ + if (unlikely(fc->flogi_progress)) + fcoe_recv_flogi(fc, fp, mac); + fc_exch_recv(lp, lp->emp, fp); + } + return 0; +} + +/** + * fcoe_recv_flogi - flogi receive function + * @fc: associated fcoe_softc + * @fp: the received frame + * @sa: the source address of this flogi + * + * This parses the flogi response and sets the corresponding + * mac address for the initiator, either OUI based or GW based. + * + * Returns: none + **/ +static void fcoe_recv_flogi(struct fcoe_softc *fc, struct fc_frame *fp, u8 *sa) +{ + struct fc_frame_header *fh; + u8 op; + + fh = fc_frame_header_get(fp); + if (fh->fh_type != FC_TYPE_ELS) + return; + op = fc_frame_payload_op(fp); + if (op == ELS_LS_ACC && fh->fh_r_ctl == FC_RCTL_ELS_REP && + fc->flogi_oxid == ntohs(fh->fh_ox_id)) { + /* + * FLOGI accepted. + * If the src mac addr is FC_OUI-based, then we mark the + * address_mode flag to use FC_OUI-based Ethernet DA. + * Otherwise we use the FCoE gateway addr + */ + if (!compare_ether_addr(sa, (u8[6]) FC_FCOE_FLOGI_MAC)) { + fc->address_mode = FCOE_FCOUI_ADDR_MODE; + } else { + memcpy(fc->dest_addr, sa, ETH_ALEN); + fc->address_mode = FCOE_GW_ADDR_MODE; + } + + /* + * Remove any previously-set unicast MAC filter. + * Add secondary FCoE MAC address filter for our OUI.
+ */ + rtnl_lock(); + if (compare_ether_addr(fc->data_src_addr, (u8[6]) { 0 })) + dev_unicast_delete(fc->real_dev, fc->data_src_addr, + ETH_ALEN); + fc_fcoe_set_mac(fc->data_src_addr, fh->fh_d_id); + dev_unicast_add(fc->real_dev, fc->data_src_addr, ETH_ALEN); + rtnl_unlock(); + + fc->flogi_progress = 0; + } else if (op == ELS_FLOGI && fh->fh_r_ctl == FC_RCTL_ELS_REQ && sa) { + /* + * Save source MAC for point-to-point responses. + */ + memcpy(fc->dest_addr, sa, ETH_ALEN); + fc->address_mode = FCOE_GW_ADDR_MODE; + } +} + +/** + * fcoe_watchdog - fcoe timer callback + * @vp: + * + * This checks the pending queue length for fcoe and put fcoe to be paused state + * if the FCOE_MAX_QUEUE_DEPTH is reached. This is done for all fc_lport on the + * fcoe_hostlist. + * + * Returns: 0 for success + **/ +void fcoe_watchdog(ulong vp) +{ + struct fc_lport *lp; + struct fcoe_softc *fc; + int paused = 0; + + read_lock(&fcoe_hostlist_lock); + list_for_each_entry(fc, &fcoe_hostlist, list) { + lp = fc->lp; + if (lp) { + if (fc->fcoe_pending_queue.qlen > FCOE_MAX_QUEUE_DEPTH) + paused = 1; + if (fcoe_check_wait_queue(lp) < FCOE_MAX_QUEUE_DEPTH) { + if (paused) + fc_unpause(lp); + } + } + } + read_unlock(&fcoe_hostlist_lock); + + fcoe_timer.expires = jiffies + (1 * HZ); + add_timer(&fcoe_timer); +} + + +/** + * fcoe_check_wait_queue - put the skb into fcoe pending xmit queue + * @lp: the fc_port for this skb + * @skb: the associated skb to be xmitted + * + * This empties the wait_queue, dequeue the head of the wait_queue queue + * and calls fcoe_start_io() for each packet, if all skb have been + * transmitted, return 0 if a error occurs, then restore wait_queue and + * try again later. + * + * The wait_queue is used when the skb transmit fails. skb will go + * in the wait_queue which will be emptied by the time function OR + * by the next skb transmit. 
+ * + * Returns: 0 for success + **/ +static int fcoe_check_wait_queue(struct fc_lport *lp) +{ + int rc, unpause = 0; + int paused = 0; + struct sk_buff *skb; + struct fcoe_softc *fc; + + fc = fcoe_softc(lp); + spin_lock_bh(&fc->fcoe_pending_queue.lock); + + /* + * is this interface paused? + */ + if (fc->fcoe_pending_queue.qlen > FCOE_MAX_QUEUE_DEPTH) + paused = 1; + if (fc->fcoe_pending_queue.qlen) { + while ((skb = __skb_dequeue(&fc->fcoe_pending_queue)) != NULL) { + spin_unlock_bh(&fc->fcoe_pending_queue.lock); + rc = fcoe_start_io(skb); + if (rc) { + fcoe_insert_wait_queue_head(lp, skb); + return rc; + } + spin_lock_bh(&fc->fcoe_pending_queue.lock); + } + if (fc->fcoe_pending_queue.qlen < FCOE_MAX_QUEUE_DEPTH) + unpause = 1; + } + spin_unlock_bh(&fc->fcoe_pending_queue.lock); + if ((unpause) && (paused)) + fc_unpause(lp); + return fc->fcoe_pending_queue.qlen; +} + +/** + * fcoe_insert_wait_queue_head - puts skb to fcoe pending queue head + * @lp: the fc_port for this skb + * @skb: the associated skb to be xmitted + * + * Returns: none + **/ +static void fcoe_insert_wait_queue_head(struct fc_lport *lp, + struct sk_buff *skb) +{ + struct fcoe_softc *fc; + + fc = fcoe_softc(lp); + spin_lock_bh(&fc->fcoe_pending_queue.lock); + __skb_queue_head(&fc->fcoe_pending_queue, skb); + spin_unlock_bh(&fc->fcoe_pending_queue.lock); +} + +/** + * fcoe_insert_wait_queue - put the skb into fcoe pending queue tail + * @lp: the fc_port for this skb + * @skb: the associated skb to be xmitted + * + * Returns: none + **/ +static void fcoe_insert_wait_queue(struct fc_lport *lp, + struct sk_buff *skb) +{ + struct fcoe_softc *fc; + + fc = fcoe_softc(lp); + spin_lock_bh(&fc->fcoe_pending_queue.lock); + __skb_queue_tail(&fc->fcoe_pending_queue, skb); + spin_unlock_bh(&fc->fcoe_pending_queue.lock); +} + +/** + * fcoe_dev_setup - setup link change notification interface + * + **/ +static void fcoe_dev_setup(void) +{ + /* + * here setup a interface specific wd time to + * monitor the link 
state + */ + register_netdevice_notifier(&fcoe_notifier); +} + +/** + * fcoe_dev_setup - cleanup link change notification interface + **/ +static void fcoe_dev_cleanup(void) +{ + unregister_netdevice_notifier(&fcoe_notifier); +} + +/** + * fcoe_device_notification - netdev event notification callback + * @notifier: context of the notification + * @event: type of event + * @ptr: fixed array for output parsed ifname + * + * This function is called by the ethernet driver in case of link change event + * + * Returns: 0 for success + **/ +static int fcoe_device_notification(struct notifier_block *notifier, + ulong event, void *ptr) +{ + struct fc_lport *lp = NULL; + struct net_device *real_dev = ptr; + struct fcoe_softc *fc; + struct fcoe_dev_stats *stats; + u16 new_status; + u32 mfs; + int rc = NOTIFY_OK; + + read_lock(&fcoe_hostlist_lock); + list_for_each_entry(fc, &fcoe_hostlist, list) { + if (fc->real_dev == real_dev) { + lp = fc->lp; + break; + } + } + read_unlock(&fcoe_hostlist_lock); + if (lp == NULL) { + rc = NOTIFY_DONE; + goto out; + } + + new_status = lp->link_status; + switch (event) { + case NETDEV_DOWN: + case NETDEV_GOING_DOWN: + new_status &= ~FC_LINK_UP; + break; + case NETDEV_UP: + case NETDEV_CHANGE: + new_status &= ~FC_LINK_UP; + if (!fcoe_link_ok(lp)) + new_status |= FC_LINK_UP; + break; + case NETDEV_CHANGEMTU: + mfs = fc->real_dev->mtu - + (sizeof(struct fcoe_hdr) + + sizeof(struct fcoe_crc_eof)); + if (mfs >= FC_MIN_MAX_FRAME) + fc_set_mfs(lp, mfs); + new_status &= ~FC_LINK_UP; + if (!fcoe_link_ok(lp)) + new_status |= FC_LINK_UP; + break; + case NETDEV_REGISTER: + break; + default: + FC_DBG("unknown event %ld call", event); + } + if (lp->link_status != new_status) { + if ((new_status & FC_LINK_UP) == FC_LINK_UP) + fc_linkup(lp); + else { + stats = lp->dev_stats[smp_processor_id()]; + if (stats) + stats->LinkFailureCount++; + fc_linkdown(lp); + fcoe_clean_pending_queue(lp); + } + } +out: + return rc; +} + +/** + * fcoe_if_to_netdev - parse a name 
buffer to get netdev + * @ifname: fixed array for output parsed ifname + * @buffer: incoming buffer to be copied + * + * Returns: NULL or ptr to netdev + **/ +static struct net_device *fcoe_if_to_netdev(const char *buffer) +{ + char *cp; + char ifname[IFNAMSIZ + 2]; + + if (buffer) { + strlcpy(ifname, buffer, IFNAMSIZ); + cp = ifname + strlen(ifname); + while (--cp >= ifname && *cp == '\n') + *cp = '\0'; + return dev_get_by_name(&init_net, ifname); + } + return NULL; +} + +/** + * fcoe_netdev_to_module_owner - finds the NIC driver module of the netdev + * @netdev: the target netdev + * + * Returns: ptr to the struct module, NULL for failure + **/ +static struct module *fcoe_netdev_to_module_owner( + const struct net_device *netdev) +{ + struct device *dev; + + if (!netdev) + return NULL; + + dev = netdev->dev.parent; + if (!dev) + return NULL; + + if (!dev->driver) + return NULL; + + return dev->driver->owner; +} + +/** + * fcoe_ethdrv_get - holds the nic driver module by try_module_get() for + * the corresponding netdev. + * @netdev: the target netdev + * + * Returns: the try_module_get() result (1 if the reference was taken, 0 if + * not), or -ENODEV if no owner module is found + **/ +static int fcoe_ethdrv_get(const struct net_device *netdev) +{ + struct module *owner; + + owner = fcoe_netdev_to_module_owner(netdev); + if (owner) { + printk(KERN_DEBUG "fcoe: hold driver module %s for %s\n", + owner->name, netdev->name); + return try_module_get(owner); + } + return -ENODEV; +} + +/** + * fcoe_ethdrv_put - releases the nic driver module by module_put for + * the corresponding netdev.
+ * @netdev: the target netdev + * + * Returns: 0 for success + **/ +static int fcoe_ethdrv_put(const struct net_device *netdev) +{ + struct module *owner; + + owner = fcoe_netdev_to_module_owner(netdev); + if (owner) { + printk(KERN_DEBUG "fcoe: release driver module %s for %s\n", + owner->name, netdev->name); + module_put(owner); + return 0; + } + return -ENODEV; +} + +/** + * fcoe_destroy - handles the destroy from sysfs + * @buffer: expected to be an eth if name + * @kp: associated kernel param + * + * Returns: 0 for success + **/ +static int fcoe_destroy(const char *buffer, struct kernel_param *kp) +{ + int rc; + struct net_device *netdev; + + netdev = fcoe_if_to_netdev(buffer); + if (!netdev) { + rc = -ENODEV; + goto out_nodev; + } + /* look for existing lport */ + if (!fcoe_hostlist_lookup(netdev)) { + rc = -ENODEV; + goto out_putdev; + } + /* pass to transport */ + rc = fcoe_transport_release(netdev); + if (rc) { + printk(KERN_ERR "fcoe: fcoe_transport_release(%s) failed\n", + netdev->name); + rc = -EIO; + goto out_putdev; + } + fcoe_ethdrv_put(netdev); + rc = 0; +out_putdev: + dev_put(netdev); +out_nodev: + return rc; +} + +/** + * fcoe_create - handles the create call from sysfs + * @buffer: expected to be an eth if name + * @kp: associated kernel param + * + * Returns: 0 for success + **/ +static int fcoe_create(const char *buffer, struct kernel_param *kp) +{ + int rc; + struct net_device *netdev; + + netdev = fcoe_if_to_netdev(buffer); + if (!netdev) { + rc = -ENODEV; + goto out_nodev; + } + /* look for existing lport */ + if (fcoe_hostlist_lookup(netdev)) { + rc = -EEXIST; + goto out_putdev; + } + fcoe_ethdrv_get(netdev); + + /* pass to transport */ + rc = fcoe_transport_attach(netdev); + if (rc) { + printk(KERN_ERR "fcoe: fcoe_transport_attach(%s) failed\n", + netdev->name); + fcoe_ethdrv_put(netdev); + rc = -EIO; + goto out_putdev; + } + rc = 0; +out_putdev: + dev_put(netdev); +out_nodev: + return rc; +} + +module_param_call(create, fcoe_create, NULL,
NULL, S_IWUSR); +__MODULE_PARM_TYPE(create, "string"); +MODULE_PARM_DESC(create, "Create fcoe port using net device passed in."); +module_param_call(destroy, fcoe_destroy, NULL, NULL, S_IWUSR); +__MODULE_PARM_TYPE(destroy, "string"); +MODULE_PARM_DESC(destroy, "Destroy fcoe port"); + +/* + * fcoe_link_ok - check if link is ok for the fc_lport + * @lp: ptr to the fc_lport + * + * Any permanently-disqualifying conditions have been previously checked. + * This also updates the speed setting, which may change with link for 100/1000. + * + * This function should probably be checking for PAUSE support at some point + * in the future. Currently Per-priority-pause is not determinable using + * ethtool, so we shouldn't be restrictive until that problem is resolved. + * + * Returns: 0 if link is OK for use by FCoE. + * + */ +int fcoe_link_ok(struct fc_lport *lp) +{ + struct fcoe_softc *fc = fcoe_softc(lp); + struct net_device *dev = fc->real_dev; + struct ethtool_cmd ecmd = { ETHTOOL_GSET }; + int rc = 0; + + if ((dev->flags & IFF_UP) && netif_carrier_ok(dev)) { + dev = fc->phys_dev; + if (dev->ethtool_ops->get_settings) { + dev->ethtool_ops->get_settings(dev, &ecmd); + lp->link_supported_speeds &= + ~(FC_PORTSPEED_1GBIT | FC_PORTSPEED_10GBIT); + if (ecmd.supported & (SUPPORTED_1000baseT_Half | + SUPPORTED_1000baseT_Full)) + lp->link_supported_speeds |= FC_PORTSPEED_1GBIT; + if (ecmd.supported & SUPPORTED_10000baseT_Full) + lp->link_supported_speeds |= + FC_PORTSPEED_10GBIT; + if (ecmd.speed == SPEED_1000) + lp->link_speed = FC_PORTSPEED_1GBIT; + if (ecmd.speed == SPEED_10000) + lp->link_speed = FC_PORTSPEED_10GBIT; + } + } else + rc = -1; + + return rc; +} +EXPORT_SYMBOL_GPL(fcoe_link_ok); + +/* + * fcoe_percpu_clean - frees skb of the corresponding lport from the per + * cpu queue. 
+ * @lp: the fc_lport + */ +void fcoe_percpu_clean(struct fc_lport *lp) +{ + int idx; + struct fcoe_percpu_s *pp; + struct fcoe_rcv_info *fr; + struct sk_buff_head *list; + struct sk_buff *skb, *next; + struct sk_buff *head; + + for (idx = 0; idx < NR_CPUS; idx++) { + if (fcoe_percpu[idx]) { + pp = fcoe_percpu[idx]; + spin_lock_bh(&pp->fcoe_rx_list.lock); + list = &pp->fcoe_rx_list; + head = list->next; + for (skb = head; skb != (struct sk_buff *)list; + skb = next) { + next = skb->next; + fr = fcoe_dev_from_skb(skb); + if (fr->fr_dev == lp) { + __skb_unlink(skb, list); + kfree_skb(skb); + } + } + spin_unlock_bh(&pp->fcoe_rx_list.lock); + } + } +} +EXPORT_SYMBOL_GPL(fcoe_percpu_clean); + +/** + * fcoe_clean_pending_queue - dequeue skb and free it + * @lp: the corresponding fc_lport + * + * Returns: none + **/ +void fcoe_clean_pending_queue(struct fc_lport *lp) +{ + struct fcoe_softc *fc = lport_priv(lp); + struct sk_buff *skb; + + spin_lock_bh(&fc->fcoe_pending_queue.lock); + while ((skb = __skb_dequeue(&fc->fcoe_pending_queue)) != NULL) { + spin_unlock_bh(&fc->fcoe_pending_queue.lock); + kfree_skb(skb); + spin_lock_bh(&fc->fcoe_pending_queue.lock); + } + spin_unlock_bh(&fc->fcoe_pending_queue.lock); +} +EXPORT_SYMBOL_GPL(fcoe_clean_pending_queue); + +/** + * libfc_host_alloc - allocate a Scsi_Host with room for the fc_lport + * @sht: ptr to the scsi host templ + * @priv_size: size of private data after fc_lport + * + * Returns: ptr to Scsi_Host + * TODO - to libfc? 
+ */ +static inline struct Scsi_Host *libfc_host_alloc( + struct scsi_host_template *sht, int priv_size) +{ + return scsi_host_alloc(sht, sizeof(struct fc_lport) + priv_size); +} + +/** + * fcoe_host_alloc - allocate a Scsi_Host with room for the fcoe_softc + * @sht: ptr to the scsi host templ + * @priv_size: size of private data after fc_lport + * + * Returns: ptr to Scsi_Host + */ +struct Scsi_Host *fcoe_host_alloc(struct scsi_host_template *sht, int priv_size) +{ + return libfc_host_alloc(sht, sizeof(struct fcoe_softc) + priv_size); +} +EXPORT_SYMBOL_GPL(fcoe_host_alloc); + +/* + * fcoe_reset - resets the fcoe + * @shost: shost the reset is from + * + * Returns: always 0 + */ +int fcoe_reset(struct Scsi_Host *shost) +{ + struct fc_lport *lport = shost_priv(shost); + fc_lport_reset(lport); + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_reset); + +/* + * fcoe_wwn_from_mac - converts 48-bit IEEE MAC address to 64-bit FC WWN. + * @mac: mac address + * @scheme: check port + * @port: port indicator for converting + * + * Returns: u64 fc world wide name + */ +u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], + unsigned int scheme, unsigned int port) +{ + u64 wwn; + u64 host_mac; + + /* The MAC is in NO, so flip only the low 48 bits */ + host_mac = ((u64) mac[0] << 40) | + ((u64) mac[1] << 32) | + ((u64) mac[2] << 24) | + ((u64) mac[3] << 16) | + ((u64) mac[4] << 8) | + (u64) mac[5]; + + WARN_ON(host_mac >= (1ULL << 48)); + wwn = host_mac | ((u64) scheme << 60); + switch (scheme) { + case 1: + WARN_ON(port != 0); + break; + case 2: + WARN_ON(port >= 0xfff); + wwn |= (u64) port << 48; + break; + default: + WARN_ON(1); + break; + } + + return wwn; +} +EXPORT_SYMBOL_GPL(fcoe_wwn_from_mac); +/* + * fcoe_hostlist_lookup_softc - find the corresponding lport by a given device + * @device: this is currently ptr to net_device + * + * Returns: NULL or the located fcoe_softc + */ +static struct fcoe_softc *fcoe_hostlist_lookup_softc( + const struct net_device *dev) +{ + struct 
fcoe_softc *fc; + + read_lock(&fcoe_hostlist_lock); + list_for_each_entry(fc, &fcoe_hostlist, list) { + if (fc->real_dev == dev) { + read_unlock(&fcoe_hostlist_lock); + return fc; + } + } + read_unlock(&fcoe_hostlist_lock); + return NULL; +} + +/* + * fcoe_hostlist_lookup - find the corresponding lport by netdev + * @netdev: ptr to net_device + * + * Returns: 0 for success + */ +struct fc_lport *fcoe_hostlist_lookup(const struct net_device *netdev) +{ + struct fcoe_softc *fc; + + fc = fcoe_hostlist_lookup_softc(netdev); + + return (fc) ? fc->lp : NULL; +} +EXPORT_SYMBOL_GPL(fcoe_hostlist_lookup); + +/* + * fcoe_hostlist_add - add a lport to lports list + * @lp: ptr to the fc_lport to badded + * + * Returns: 0 for success + */ +int fcoe_hostlist_add(const struct fc_lport *lp) +{ + struct fcoe_softc *fc; + + fc = fcoe_hostlist_lookup_softc(fcoe_netdev(lp)); + if (!fc) { + fc = fcoe_softc(lp); + write_lock_bh(&fcoe_hostlist_lock); + list_add_tail(&fc->list, &fcoe_hostlist); + write_unlock_bh(&fcoe_hostlist_lock); + } + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_hostlist_add); + +/* + * fcoe_hostlist_remove - remove a lport from lports list + * @lp: ptr to the fc_lport to badded + * + * Returns: 0 for success + */ +int fcoe_hostlist_remove(const struct fc_lport *lp) +{ + struct fcoe_softc *fc; + + fc = fcoe_hostlist_lookup_softc(fcoe_netdev(lp)); + BUG_ON(!fc); + write_lock_bh(&fcoe_hostlist_lock); + list_del(&fc->list); + write_unlock_bh(&fcoe_hostlist_lock); + + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_hostlist_remove); + +/** + * fcoe_libfc_config - sets up libfc related properties for lport + * @lp: ptr to the fc_lport + * @tt: libfc function template + * + * Returns : 0 for success + **/ +int fcoe_libfc_config(struct fc_lport *lp, struct libfc_function_template *tt) +{ + /* Set the function pointers set by the LLDD */ + memcpy(&lp->tt, tt, sizeof(*tt)); + if (fc_fcp_init(lp)) + return -ENOMEM; + fc_exch_init(lp); + fc_elsct_init(lp); + fc_lport_init(lp); + 
fc_rport_init(lp); + fc_disc_init(lp); + + return 0; +} +EXPORT_SYMBOL_GPL(fcoe_libfc_config); + +/** + * fcoe_init - fcoe module loading initialization + * + * Initialization routine + * 1. Will create fc transport software structure + * 2. initialize the link list of port information structure + * + * Returns 0 on success, negative on failure + **/ +static int __init fcoe_init(void) +{ + int cpu; + struct fcoe_percpu_s *p; + + + INIT_LIST_HEAD(&fcoe_hostlist); + rwlock_init(&fcoe_hostlist_lock); + +#ifdef CONFIG_HOTPLUG_CPU + register_cpu_notifier(&fcoe_cpu_notifier); +#endif /* CONFIG_HOTPLUG_CPU */ + + /* + * initialize per CPU interrupt thread + */ + for_each_online_cpu(cpu) { + p = kzalloc(sizeof(struct fcoe_percpu_s), GFP_KERNEL); + if (p) { + p->thread = kthread_create(fcoe_percpu_receive_thread, + (void *)p, + "fcoethread/%d", cpu); + + /* + * if there is no error then bind the thread to the cpu + * initialize the semaphore and skb queue head + */ + if (likely(!IS_ERR(p->thread))) { + p->cpu = cpu; + fcoe_percpu[cpu] = p; + skb_queue_head_init(&p->fcoe_rx_list); + kthread_bind(p->thread, cpu); + wake_up_process(p->thread); + } else { + fcoe_percpu[cpu] = NULL; + kfree(p); + + } + } + } + + /* + * setup link change notification + */ + fcoe_dev_setup(); + + init_timer(&fcoe_timer); + fcoe_timer.data = 0; + fcoe_timer.function = fcoe_watchdog; + fcoe_timer.expires = (jiffies + (10 * HZ)); + add_timer(&fcoe_timer); + + /* initiatlize the fcoe transport */ + fcoe_transport_init(); + + fcoe_sw_init(); + + return 0; +} +module_init(fcoe_init); + +/** + * fcoe_exit - fcoe module unloading cleanup + * + * Returns 0 on success, negative on failure + **/ +static void __exit fcoe_exit(void) +{ + u32 idx; + struct fcoe_softc *fc, *tmp; + struct fcoe_percpu_s *p; + struct sk_buff *skb; + + /* + * Stop all call back interfaces + */ +#ifdef CONFIG_HOTPLUG_CPU + unregister_cpu_notifier(&fcoe_cpu_notifier); +#endif /* CONFIG_HOTPLUG_CPU */ + fcoe_dev_cleanup(); + + /* + * 
stop timer + */ + del_timer_sync(&fcoe_timer); + + /* releases the associated fcoe transport for each lport */ + list_for_each_entry_safe(fc, tmp, &fcoe_hostlist, list) + fcoe_transport_release(fc->real_dev); + + for (idx = 0; idx < NR_CPUS; idx++) { + if (fcoe_percpu[idx]) { + kthread_stop(fcoe_percpu[idx]->thread); + p = fcoe_percpu[idx]; + spin_lock_bh(&p->fcoe_rx_list.lock); + while ((skb = __skb_dequeue(&p->fcoe_rx_list)) != NULL) + kfree_skb(skb); + spin_unlock_bh(&p->fcoe_rx_list.lock); + if (fcoe_percpu[idx]->crc_eof_page) + put_page(fcoe_percpu[idx]->crc_eof_page); + kfree(fcoe_percpu[idx]); + } + } + + /* remove sw transport */ + fcoe_sw_exit(); + + /* detach the transport */ + fcoe_transport_exit(); +} +module_exit(fcoe_exit); diff --git a/include/scsi/fc_transport_fcoe.h b/include/scsi/fc_transport_fcoe.h new file mode 100644 index 00000000000..8dca2af14ff --- /dev/null +++ b/include/scsi/fc_transport_fcoe.h @@ -0,0 +1,54 @@ +#ifndef FC_TRANSPORT_FCOE_H +#define FC_TRANSPORT_FCOE_H + +#include +#include +#include +#include + +/** + * struct fcoe_transport - FCoE transport struct for generic transport + * for Ethernet devices as well as pure HBAs + * + * @name: name for this transport + * @bus: physical bus type (pci_bus_type) + * @driver: physical bus driver for network device + * @create: entry create function + * @destroy: exit destroy function + * @list: list of transports + */ +struct fcoe_transport { + char *name; + unsigned short vendor; + unsigned short device; + struct bus_type *bus; + struct device_driver *driver; + int (*create)(struct net_device *device); + int (*destroy)(struct net_device *device); + bool (*match)(struct net_device *device); + struct list_head list; + struct list_head devlist; + struct mutex devlock; +}; + +/** + * MODULE_ALIAS_FCOE_PCI + * + * some care must be taken with this, vendor and device MUST be a hex value + * preceded with 0x and with letters in lower case (0x12ab, not 0x12AB or 12AB) + */ +#define
MODULE_ALIAS_FCOE_PCI(vendor, device) \ + MODULE_ALIAS("fcoe-pci-" __stringify(vendor) "-" __stringify(device)) + +/* exported funcs */ +int fcoe_transport_attach(struct net_device *netdev); +int fcoe_transport_release(struct net_device *netdev); +int fcoe_transport_register(struct fcoe_transport *t); +int fcoe_transport_unregister(struct fcoe_transport *t); +int fcoe_load_transport_driver(struct net_device *netdev); +int __init fcoe_transport_init(void); +int __exit fcoe_transport_exit(void); + +/* fcow_sw is the default transport */ +extern struct fcoe_transport fcoe_sw_transport; +#endif /* FC_TRANSPORT_FCOE_H */ diff --git a/include/scsi/libfcoe.h b/include/scsi/libfcoe.h new file mode 100644 index 00000000000..89fdbb9a6a1 --- /dev/null +++ b/include/scsi/libfcoe.h @@ -0,0 +1,176 @@ +/* + * Copyright(c) 2007 - 2008 Intel Corporation. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. 
+ * + * Maintained at www.Open-FCoE.org + */ + +#ifndef _LIBFCOE_H +#define _LIBFCOE_H + +#include +#include +#include +#include + +/* + * this percpu struct for fcoe + */ +struct fcoe_percpu_s { + int cpu; + struct task_struct *thread; + struct sk_buff_head fcoe_rx_list; + struct page *crc_eof_page; + int crc_eof_offset; +}; + +/* + * the fcoe sw transport private data + */ +struct fcoe_softc { + struct list_head list; + struct fc_lport *lp; + struct net_device *real_dev; + struct net_device *phys_dev; /* device with ethtool_ops */ + struct packet_type fcoe_packet_type; + struct sk_buff_head fcoe_pending_queue; + + u8 dest_addr[ETH_ALEN]; + u8 ctl_src_addr[ETH_ALEN]; + u8 data_src_addr[ETH_ALEN]; + /* + * fcoe protocol address learning related stuff + */ + u16 flogi_oxid; + u8 flogi_progress; + u8 address_mode; +}; + +static inline struct fcoe_softc *fcoe_softc( + const struct fc_lport *lp) +{ + return (struct fcoe_softc *)lport_priv(lp); +} + +static inline struct net_device *fcoe_netdev( + const struct fc_lport *lp) +{ + return fcoe_softc(lp)->real_dev; +} + +static inline struct fcoe_hdr *skb_fcoe_header(const struct sk_buff *skb) +{ + return (struct fcoe_hdr *)skb_network_header(skb); +} + +static inline int skb_fcoe_offset(const struct sk_buff *skb) +{ + return skb_network_offset(skb); +} + +static inline struct fc_frame_header *skb_fc_header(const struct sk_buff *skb) +{ + return (struct fc_frame_header *)skb_transport_header(skb); +} + +static inline int skb_fc_offset(const struct sk_buff *skb) +{ + return skb_transport_offset(skb); +} + +static inline void skb_reset_fc_header(struct sk_buff *skb) +{ + skb_reset_network_header(skb); + skb_set_transport_header(skb, skb_network_offset(skb) + + sizeof(struct fcoe_hdr)); +} + +static inline bool skb_fc_is_data(const struct sk_buff *skb) +{ + return skb_fc_header(skb)->fh_r_ctl == FC_RCTL_DD_SOL_DATA; +} + +static inline bool skb_fc_is_cmd(const struct sk_buff *skb) +{ + return skb_fc_header(skb)->fh_r_ctl == 
FC_RCTL_DD_UNSOL_CMD; +} + +static inline bool skb_fc_has_exthdr(const struct sk_buff *skb) +{ + return (skb_fc_header(skb)->fh_r_ctl == FC_RCTL_VFTH) || + (skb_fc_header(skb)->fh_r_ctl == FC_RCTL_IFRH) || + (skb_fc_header(skb)->fh_r_ctl == FC_RCTL_ENCH); +} + +static inline bool skb_fc_is_roff(const struct sk_buff *skb) +{ + return skb_fc_header(skb)->fh_f_ctl[2] & FC_FC_REL_OFF; +} + +static inline u16 skb_fc_oxid(const struct sk_buff *skb) +{ + return be16_to_cpu(skb_fc_header(skb)->fh_ox_id); +} + +static inline u16 skb_fc_rxid(const struct sk_buff *skb) +{ + return be16_to_cpu(skb_fc_header(skb)->fh_rx_id); +} + +/* FIXME - DMA_BIDIRECTIONAL ? */ +#define skb_cb(skb) ((struct fcoe_rcv_info *)&((skb)->cb[0])) +#define skb_cmd(skb) (skb_cb(skb)->fr_cmd) +#define skb_dir(skb) (skb_cmd(skb)->sc_data_direction) +static inline bool skb_fc_is_read(const struct sk_buff *skb) +{ + if (skb_fc_is_cmd(skb) && skb_cmd(skb)) + return skb_dir(skb) == DMA_FROM_DEVICE; + return false; +} + +static inline bool skb_fc_is_write(const struct sk_buff *skb) +{ + if (skb_fc_is_cmd(skb) && skb_cmd(skb)) + return skb_dir(skb) == DMA_TO_DEVICE; + return false; +} + +/* libfcoe funcs */ +int fcoe_reset(struct Scsi_Host *shost); +u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN], + unsigned int scheme, unsigned int port); + +u32 fcoe_fc_crc(struct fc_frame *fp); +int fcoe_xmit(struct fc_lport *, struct fc_frame *); +int fcoe_rcv(struct sk_buff *, struct net_device *, + struct packet_type *, struct net_device *); + +int fcoe_percpu_receive_thread(void *arg); +void fcoe_clean_pending_queue(struct fc_lport *lp); +void fcoe_percpu_clean(struct fc_lport *lp); +void fcoe_watchdog(ulong vp); +int fcoe_link_ok(struct fc_lport *lp); + +struct fc_lport *fcoe_hostlist_lookup(const struct net_device *); +int fcoe_hostlist_add(const struct fc_lport *); +int fcoe_hostlist_remove(const struct fc_lport *); + +struct Scsi_Host *fcoe_host_alloc(struct scsi_host_template *, int); +int 
fcoe_libfc_config(struct fc_lport *, struct libfc_function_template *); + +/* fcoe sw hba */ +int __init fcoe_sw_init(void); +int __exit fcoe_sw_exit(void); +#endif /* _LIBFCOE_H */ -- cgit v1.2.3-70-g09d2 From c3673464ebc004a3d82063cd41b9cf74d1b55db2 Mon Sep 17 00:00:00 2001 From: Karen Xie Date: Tue, 9 Dec 2008 14:15:32 -0800 Subject: [SCSI] cxgb3i: Add cxgb3i iSCSI driver. This patch implements the cxgb3i iscsi connection acceleration for the open-iscsi initiator. The cxgb3i driver offers the iscsi PDU based offload: - digest insertion and verification - payload direct-placement into host memory buffer. Signed-off-by: Karen Xie Signed-off-by: James Bottomley --- Documentation/scsi/cxgb3i.txt | 85 ++ drivers/scsi/Kconfig | 2 + drivers/scsi/Makefile | 1 + drivers/scsi/cxgb3i/Kbuild | 4 + drivers/scsi/cxgb3i/Kconfig | 6 + drivers/scsi/cxgb3i/cxgb3i.h | 139 +++ drivers/scsi/cxgb3i/cxgb3i_ddp.c | 770 +++++++++++++++ drivers/scsi/cxgb3i/cxgb3i_ddp.h | 306 ++++++ drivers/scsi/cxgb3i/cxgb3i_init.c | 107 ++ drivers/scsi/cxgb3i/cxgb3i_iscsi.c | 951 ++++++++++++++++++ drivers/scsi/cxgb3i/cxgb3i_offload.c | 1810 ++++++++++++++++++++++++++++++++++ drivers/scsi/cxgb3i/cxgb3i_offload.h | 231 +++++ drivers/scsi/cxgb3i/cxgb3i_pdu.c | 402 ++++++++ drivers/scsi/cxgb3i/cxgb3i_pdu.h | 59 ++ 14 files changed, 4873 insertions(+) create mode 100644 Documentation/scsi/cxgb3i.txt create mode 100644 drivers/scsi/cxgb3i/Kbuild create mode 100644 drivers/scsi/cxgb3i/Kconfig create mode 100644 drivers/scsi/cxgb3i/cxgb3i.h create mode 100644 drivers/scsi/cxgb3i/cxgb3i_ddp.c create mode 100644 drivers/scsi/cxgb3i/cxgb3i_ddp.h create mode 100644 drivers/scsi/cxgb3i/cxgb3i_init.c create mode 100644 drivers/scsi/cxgb3i/cxgb3i_iscsi.c create mode 100644 drivers/scsi/cxgb3i/cxgb3i_offload.c create mode 100644 drivers/scsi/cxgb3i/cxgb3i_offload.h create mode 100644 drivers/scsi/cxgb3i/cxgb3i_pdu.c create mode 100644 drivers/scsi/cxgb3i/cxgb3i_pdu.h (limited to 'drivers/scsi/Kconfig') diff --git 
a/Documentation/scsi/cxgb3i.txt b/Documentation/scsi/cxgb3i.txt new file mode 100644 index 00000000000..8141fa01978 --- /dev/null +++ b/Documentation/scsi/cxgb3i.txt @@ -0,0 +1,85 @@ +Chelsio S3 iSCSI Driver for Linux + +Introduction +============ + +The Chelsio T3 ASIC based Adapters (S310, S320, S302, S304, Mezz cards, etc. +series of products) support iSCSI acceleration and iSCSI Direct Data Placement +(DDP) where the hardware handles the expensive byte touching operations, such +as CRC computation and verification, and direct DMA to the final host memory +destination: + + - iSCSI PDU digest generation and verification + + On transmitting, Chelsio S3 h/w computes and inserts the Header and + Data digest into the PDUs. + On receiving, Chelsio S3 h/w computes and verifies the Header and + Data digest of the PDUs. + + - Direct Data Placement (DDP) + + S3 h/w can directly place the iSCSI Data-In or Data-Out PDU's + payload into pre-posted final destination host-memory buffers based + on the Initiator Task Tag (ITT) in Data-In or Target Task Tag (TTT) + in Data-Out PDUs. + + - PDU Transmit and Recovery + + On transmitting, S3 h/w accepts the complete PDU (header + data) + from the host driver, computes and inserts the digests, decomposes + the PDU into multiple TCP segments if necessary, and transmits all + the TCP segments onto the wire. It handles TCP retransmission if + needed. + + On receiving, S3 h/w recovers the iSCSI PDU by reassembling TCP + segments, separating the header and data, calculating and verifying + the digests, then forwards the header to the host. The payload data, + if possible, will be directly placed into the pre-posted host DDP + buffer. Otherwise, the payload data will be sent to the host too. + +The cxgb3i driver interfaces with the open-iscsi initiator and provides the iSCSI +acceleration through Chelsio hardware wherever applicable. 
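The DDP description above says the hardware finds the destination buffer from the task tag. As a rough illustration of what that implies, the sketch below packs a pagepod index into a 32-bit tag above the 6 "color" bits and extracts it again, mirroring the PPOD_COLOR_SIZE, PPOD_IDX_SHIFT and PPOD_IDX_MAX_SIZE values defined in cxgb3i_ddp.h in this patch. The helper names (ppod_idx_mask, ddp_tag_set_idx, ddp_tag_get_idx) are hypothetical and are not part of the driver; this is a user-space sketch, not the driver's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from cxgb3i_ddp.h in this patch. */
#define PPOD_COLOR_SIZE   6
#define PPOD_IDX_SHIFT    PPOD_COLOR_SIZE
#define PPOD_IDX_MAX_SIZE 24

/* Hypothetical helper: mask covering idx_bits bits of pagepod index. */
static uint32_t ppod_idx_mask(unsigned int idx_bits)
{
	assert(idx_bits > 0 && idx_bits <= PPOD_IDX_MAX_SIZE);
	return (UINT32_C(1) << idx_bits) - 1;
}

/* Hypothetical helper: place a pagepod index above the color bits. */
static uint32_t ddp_tag_set_idx(uint32_t base_tag, uint32_t idx)
{
	return base_tag | (idx << PPOD_IDX_SHIFT);
}

/* Hypothetical helper: recover the pagepod index from a tag. */
static uint32_t ddp_tag_get_idx(uint32_t tag, unsigned int idx_bits)
{
	return (tag >> PPOD_IDX_SHIFT) & ppod_idx_mask(idx_bits);
}
```

The extraction step corresponds to the "(tag >> PPOD_IDX_SHIFT) & ddp->idx_mask" expression used by cxgb3i_ddp_tag_release() in this patch; the base tag is assumed to fit in the low 6 bits.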
+ +Using the cxgb3i Driver +======================= + +The following steps need to be taken to accelerate the open-iscsi initiator: + +1. Load the cxgb3i driver: "modprobe cxgb3i" + + The cxgb3i module registers a new transport class "cxgb3i" with open-iscsi. + + * in the case of recompiling the kernel, the cxgb3i selection is located at + Device Drivers + SCSI device support ---> + [*] SCSI low-level drivers ---> + Chelsio S3xx iSCSI support + +2. Create an interface file located under /etc/iscsi/ifaces/ for the new + transport class "cxgb3i". + + The content of the file should be in the following format: + iface.transport_name = cxgb3i + iface.net_ifacename = + iface.ipaddress = + + * if iface.ipaddress is specified, it needs to be either the + same as the ethX's ip address or an address on the same subnet. Make + sure the ip address is unique in the network. + +3. Edit /etc/iscsi/iscsid.conf + The default setting for MaxRecvDataSegmentLength (131072) is too big; + set "node.conn[0].iscsi.MaxRecvDataSegmentLength" to a value no + bigger than 15360 (for example 8192): + + node.conn[0].iscsi.MaxRecvDataSegmentLength = 8192 + + * The login would fail for a normal session if MaxRecvDataSegmentLength is + too big. An error message in the format of + "cxgb3i: ERR! MaxRecvSegmentLength too big. Need to be <= ." + would be logged to dmesg. + +4. To direct open-iscsi traffic to go through cxgb3i's accelerated path, + "-I " option needs to be specified with most of the + iscsiadm command. is the transport interface file created + in step 2. 
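Step 3 above puts an upper bound of 15360 on MaxRecvDataSegmentLength. The sketch below shows one way a setup script's helper could check a single iscsid.conf line against that bound before attempting a login. The function name check_max_recv_line and the macro CXGB3I_DOC_MAX_RECV_SEG are hypothetical; neither open-iscsi nor the cxgb3i driver provides them, and the only values taken from the text are the key name and the 15360 limit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Upper bound quoted in step 3 of the documentation above. */
#define CXGB3I_DOC_MAX_RECV_SEG 15360L

/*
 * Hypothetical helper, not part of open-iscsi or cxgb3i.
 * Returns  1 if the line sets MaxRecvDataSegmentLength to 1..15360,
 *          0 if it sets the key to a value outside that range,
 *         -1 if the line is not that setting or cannot be parsed.
 */
static int check_max_recv_line(const char *line)
{
	static const char key[] =
		"node.conn[0].iscsi.MaxRecvDataSegmentLength";
	long val;

	assert(line != NULL);

	/* The setting must start at the beginning of the line. */
	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return -1;

	/* Accept optional whitespace around '='. */
	if (sscanf(line + sizeof(key) - 1, " = %ld", &val) != 1)
		return -1;

	return (val > 0 && val <= CXGB3I_DOC_MAX_RECV_SEG) ? 1 : 0;
}
```

A caller would read iscsid.conf line by line and treat a return value of 0 as the condition that would make the cxgb3i login fail with the "MaxRecvSegmentLength too big" message.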
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 673463e4bbf..0e5e084dfb4 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -352,6 +352,8 @@ config ISCSI_TCP http://open-iscsi.org +source "drivers/scsi/cxgb3i/Kconfig" + config SGIWD93_SCSI tristate "SGI WD93C93 SCSI Driver" depends on SGI_HAS_WD93 && SCSI diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index 07d0f58de9b..1410697257c 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -126,6 +126,7 @@ obj-$(CONFIG_SCSI_HPTIOP) += hptiop.o obj-$(CONFIG_SCSI_STEX) += stex.o obj-$(CONFIG_SCSI_MVSAS) += mvsas.o obj-$(CONFIG_PS3_ROM) += ps3rom.o +obj-$(CONFIG_SCSI_CXGB3_ISCSI) += libiscsi.o libiscsi_tcp.o cxgb3i/ obj-$(CONFIG_ARM) += arm/ diff --git a/drivers/scsi/cxgb3i/Kbuild b/drivers/scsi/cxgb3i/Kbuild new file mode 100644 index 00000000000..ee7d6d2f9c3 --- /dev/null +++ b/drivers/scsi/cxgb3i/Kbuild @@ -0,0 +1,4 @@ +EXTRA_CFLAGS += -I$(TOPDIR)/drivers/net/cxgb3 + +cxgb3i-y := cxgb3i_init.o cxgb3i_iscsi.o cxgb3i_pdu.o cxgb3i_offload.o +obj-$(CONFIG_SCSI_CXGB3_ISCSI) += cxgb3i_ddp.o cxgb3i.o diff --git a/drivers/scsi/cxgb3i/Kconfig b/drivers/scsi/cxgb3i/Kconfig new file mode 100644 index 00000000000..276281460ec --- /dev/null +++ b/drivers/scsi/cxgb3i/Kconfig @@ -0,0 +1,6 @@ +config SCSI_CXGB3_ISCSI + tristate "Chelsio S3xx iSCSI support" + select CHELSIO_T3 + select SCSI_ISCSI_ATTRS + ---help--- + This driver supports iSCSI offload for the Chelsio S3 series devices. diff --git a/drivers/scsi/cxgb3i/cxgb3i.h b/drivers/scsi/cxgb3i/cxgb3i.h new file mode 100644 index 00000000000..fde6e4c634e --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i.h @@ -0,0 +1,139 @@ +/* + * cxgb3i.h: Chelsio S3xx iSCSI driver. + * + * Copyright (c) 2008 Chelsio Communications, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. 
+ * + * Written by: Karen Xie (kxie@chelsio.com) + */ + +#ifndef __CXGB3I_H__ +#define __CXGB3I_H__ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* from cxgb3 LLD */ +#include "common.h" +#include "t3_cpl.h" +#include "t3cdev.h" +#include "cxgb3_ctl_defs.h" +#include "cxgb3_offload.h" +#include "firmware_exports.h" + +#include "cxgb3i_offload.h" +#include "cxgb3i_ddp.h" + +#define CXGB3I_SCSI_QDEPTH_DFLT 128 +#define CXGB3I_MAX_TARGET CXGB3I_MAX_CONN +#define CXGB3I_MAX_LUN 512 +#define ISCSI_PDU_NONPAYLOAD_MAX \ + (sizeof(struct iscsi_hdr) + ISCSI_MAX_AHS_SIZE + 2*ISCSI_DIGEST_SIZE) + +struct cxgb3i_adapter; +struct cxgb3i_hba; +struct cxgb3i_endpoint; + +/** + * struct cxgb3i_hba - cxgb3i iscsi structure (per port) + * + * @snic: cxgb3i adapter containing this port + * @ndev: pointer to netdev structure + * @shost: pointer to scsi host structure + */ +struct cxgb3i_hba { + struct cxgb3i_adapter *snic; + struct net_device *ndev; + struct Scsi_Host *shost; +}; + +/** + * struct cxgb3i_adapter - cxgb3i adapter structure (per pci) + * + * @listhead: list head to link elements + * @lock: lock for this structure + * @tdev: pointer to t3cdev used by cxgb3 driver + * @pdev: pointer to pci dev + * @hba_cnt: # of hbas (the same as # of ports) + * @hba: all the hbas on this adapter + * @tx_max_size: max. tx packet size supported + * @rx_max_size: max. rx packet size supported + * @tag_format: ddp tag format settings + */ +struct cxgb3i_adapter { + struct list_head list_head; + spinlock_t lock; + struct t3cdev *tdev; + struct pci_dev *pdev; + unsigned char hba_cnt; + struct cxgb3i_hba *hba[MAX_NPORTS]; + + unsigned int tx_max_size; + unsigned int rx_max_size; + + struct cxgb3i_tag_format tag_format; +}; + +/** + * struct cxgb3i_conn - cxgb3i iscsi connection + * + * @listhead: list head to link elements + * @cep: pointer to iscsi_endpoint structure + * @conn: pointer to iscsi_conn structure + * @hba: pointer to the hba this conn. 
is going through + * @task_idx_bits: # of bits needed for session->cmds_max + */ +struct cxgb3i_conn { + struct list_head list_head; + struct cxgb3i_endpoint *cep; + struct iscsi_conn *conn; + struct cxgb3i_hba *hba; + unsigned int task_idx_bits; +}; + +/** + * struct cxgb3i_endpoint - iscsi tcp endpoint + * + * @c3cn: the h/w tcp connection representation + * @hba: pointer to the hba this conn. is going through + * @cconn: pointer to the associated cxgb3i iscsi connection + */ +struct cxgb3i_endpoint { + struct s3_conn *c3cn; + struct cxgb3i_hba *hba; + struct cxgb3i_conn *cconn; +}; + +int cxgb3i_iscsi_init(void); +void cxgb3i_iscsi_cleanup(void); + +struct cxgb3i_adapter *cxgb3i_adapter_add(struct t3cdev *); +void cxgb3i_adapter_remove(struct t3cdev *); +int cxgb3i_adapter_ulp_init(struct cxgb3i_adapter *); +void cxgb3i_adapter_ulp_cleanup(struct cxgb3i_adapter *); + +struct cxgb3i_hba *cxgb3i_hba_find_by_netdev(struct net_device *); +struct cxgb3i_hba *cxgb3i_hba_host_add(struct cxgb3i_adapter *, + struct net_device *); +void cxgb3i_hba_host_remove(struct cxgb3i_hba *); + +int cxgb3i_pdu_init(void); +void cxgb3i_pdu_cleanup(void); +void cxgb3i_conn_cleanup_task(struct iscsi_task *); +int cxgb3i_conn_alloc_pdu(struct iscsi_task *, u8); +int cxgb3i_conn_init_pdu(struct iscsi_task *, unsigned int, unsigned int); +int cxgb3i_conn_xmit_pdu(struct iscsi_task *); + +void cxgb3i_release_itt(struct iscsi_task *task, itt_t hdr_itt); +int cxgb3i_reserve_itt(struct iscsi_task *task, itt_t *hdr_itt); + +#endif diff --git a/drivers/scsi/cxgb3i/cxgb3i_ddp.c b/drivers/scsi/cxgb3i/cxgb3i_ddp.c new file mode 100644 index 00000000000..1a41f04264f --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_ddp.c @@ -0,0 +1,770 @@ +/* + * cxgb3i_ddp.c: Chelsio S3xx iSCSI DDP Manager. + * + * Copyright (c) 2008 Chelsio Communications, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + * + * Written by: Karen Xie (kxie@chelsio.com) + */ + +#include + +/* from cxgb3 LLD */ +#include "common.h" +#include "t3_cpl.h" +#include "t3cdev.h" +#include "cxgb3_ctl_defs.h" +#include "cxgb3_offload.h" +#include "firmware_exports.h" + +#include "cxgb3i_ddp.h" + +#define DRV_MODULE_NAME "cxgb3i_ddp" +#define DRV_MODULE_VERSION "1.0.0" +#define DRV_MODULE_RELDATE "Dec. 1, 2008" + +static char version[] = + "Chelsio S3xx iSCSI DDP " DRV_MODULE_NAME + " v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n"; + +MODULE_AUTHOR("Karen Xie "); +MODULE_DESCRIPTION("cxgb3i ddp pagepod manager"); +MODULE_LICENSE("GPL"); +MODULE_VERSION(DRV_MODULE_VERSION); + +#define ddp_log_error(fmt...) printk(KERN_ERR "cxgb3i_ddp: ERR! " fmt) +#define ddp_log_warn(fmt...) printk(KERN_WARNING "cxgb3i_ddp: WARN! " fmt) +#define ddp_log_info(fmt...) printk(KERN_INFO "cxgb3i_ddp: " fmt) + +#ifdef __DEBUG_CXGB3I_DDP__ +#define ddp_log_debug(fmt, args...) \ + printk(KERN_INFO "cxgb3i_ddp: %s - " fmt, __func__ , ## args) +#else +#define ddp_log_debug(fmt...) +#endif + +/* + * iSCSI Direct Data Placement + * + * T3 h/w can directly place the iSCSI Data-In or Data-Out PDU's payload into + * pre-posted final destination host-memory buffers based on the Initiator + * Task Tag (ITT) in Data-In or Target Task Tag (TTT) in Data-Out PDUs. + * + * The host memory address is programmed into h/w in the format of pagepod + * entries. + * The location of the pagepod entry is encoded into ddp tag which is used or + * is the base for ITT/TTT. 
+ */ + +#define DDP_PGIDX_MAX 4 +#define DDP_THRESHOLD 2048 +static unsigned char ddp_page_order[DDP_PGIDX_MAX] = {0, 1, 2, 4}; +static unsigned char ddp_page_shift[DDP_PGIDX_MAX] = {12, 13, 14, 16}; +static unsigned char page_idx = DDP_PGIDX_MAX; + +static LIST_HEAD(cxgb3i_ddp_list); +static DEFINE_RWLOCK(cxgb3i_ddp_rwlock); + +/* + * functions to program the pagepod in h/w + */ +static inline void ulp_mem_io_set_hdr(struct sk_buff *skb, unsigned int addr) +{ + struct ulp_mem_io *req = (struct ulp_mem_io *)skb->head; + + req->wr.wr_lo = 0; + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_BYPASS)); + req->cmd_lock_addr = htonl(V_ULP_MEMIO_ADDR(addr >> 5) | + V_ULPTX_CMD(ULP_MEM_WRITE)); + req->len = htonl(V_ULP_MEMIO_DATA_LEN(PPOD_SIZE >> 5) | + V_ULPTX_NFLITS((PPOD_SIZE >> 3) + 1)); +} + +static int set_ddp_map(struct cxgb3i_ddp_info *ddp, struct pagepod_hdr *hdr, + unsigned int idx, unsigned int npods, + struct cxgb3i_gather_list *gl) +{ + unsigned int pm_addr = (idx << PPOD_SIZE_SHIFT) + ddp->llimit; + int i; + + for (i = 0; i < npods; i++, idx++, pm_addr += PPOD_SIZE) { + struct sk_buff *skb = ddp->gl_skb[idx]; + struct pagepod *ppod; + int j, pidx; + + /* hold on to the skb until we clear the ddp mapping */ + skb_get(skb); + + ulp_mem_io_set_hdr(skb, pm_addr); + ppod = (struct pagepod *) + (skb->head + sizeof(struct ulp_mem_io)); + memcpy(&(ppod->hdr), hdr, sizeof(struct pagepod)); + for (pidx = 4 * i, j = 0; j < 5; ++j, ++pidx) + ppod->addr[j] = pidx < gl->nelem ? 
+ cpu_to_be64(gl->phys_addr[pidx]) : 0UL; + + skb->priority = CPL_PRIORITY_CONTROL; + cxgb3_ofld_send(ddp->tdev, skb); + } + return 0; +} + +static int clear_ddp_map(struct cxgb3i_ddp_info *ddp, unsigned int idx, + unsigned int npods) +{ + unsigned int pm_addr = (idx << PPOD_SIZE_SHIFT) + ddp->llimit; + int i; + + for (i = 0; i < npods; i++, idx++, pm_addr += PPOD_SIZE) { + struct sk_buff *skb = ddp->gl_skb[idx]; + + ddp->gl_skb[idx] = NULL; + memset((skb->head + sizeof(struct ulp_mem_io)), 0, PPOD_SIZE); + ulp_mem_io_set_hdr(skb, pm_addr); + skb->priority = CPL_PRIORITY_CONTROL; + cxgb3_ofld_send(ddp->tdev, skb); + } + return 0; +} + +static inline int ddp_find_unused_entries(struct cxgb3i_ddp_info *ddp, + int start, int max, int count, + struct cxgb3i_gather_list *gl) +{ + unsigned int i, j; + + spin_lock(&ddp->map_lock); + for (i = start; i <= max;) { + for (j = 0; j < count; j++) { + if (ddp->gl_map[i + j]) + break; + } + if (j == count) { + for (j = 0; j < count; j++) + ddp->gl_map[i + j] = gl; + spin_unlock(&ddp->map_lock); + return i; + } + i += j + 1; + } + spin_unlock(&ddp->map_lock); + return -EBUSY; +} + +static inline void ddp_unmark_entries(struct cxgb3i_ddp_info *ddp, + int start, int count) +{ + spin_lock(&ddp->map_lock); + memset(&ddp->gl_map[start], 0, + count * sizeof(struct cxgb3i_gather_list *)); + spin_unlock(&ddp->map_lock); +} + +static inline void ddp_free_gl_skb(struct cxgb3i_ddp_info *ddp, + int idx, int count) +{ + int i; + + for (i = 0; i < count; i++, idx++) + if (ddp->gl_skb[idx]) { + kfree_skb(ddp->gl_skb[idx]); + ddp->gl_skb[idx] = NULL; + } +} + +static inline int ddp_alloc_gl_skb(struct cxgb3i_ddp_info *ddp, int idx, + int count, gfp_t gfp) +{ + int i; + + for (i = 0; i < count; i++) { + struct sk_buff *skb = alloc_skb(sizeof(struct ulp_mem_io) + + PPOD_SIZE, gfp); + if (skb) { + ddp->gl_skb[idx + i] = skb; + skb_put(skb, sizeof(struct ulp_mem_io) + PPOD_SIZE); + } else { + ddp_free_gl_skb(ddp, idx, i); + return -ENOMEM; + } + } + 
return 0; +} + +/** + * cxgb3i_ddp_find_page_index - return ddp page index for a given page size. + * @pgsz: page size + * return the ddp page index, if no match is found return DDP_PGIDX_MAX. + */ +int cxgb3i_ddp_find_page_index(unsigned long pgsz) +{ + int i; + + for (i = 0; i < DDP_PGIDX_MAX; i++) { + if (pgsz == (1UL << ddp_page_shift[i])) + return i; + } + ddp_log_debug("ddp page size 0x%lx not supported.\n", pgsz); + return DDP_PGIDX_MAX; +} +EXPORT_SYMBOL_GPL(cxgb3i_ddp_find_page_index); + +static inline void ddp_gl_unmap(struct pci_dev *pdev, + struct cxgb3i_gather_list *gl) +{ + int i; + + for (i = 0; i < gl->nelem; i++) + pci_unmap_page(pdev, gl->phys_addr[i], PAGE_SIZE, + PCI_DMA_FROMDEVICE); +} + +static inline int ddp_gl_map(struct pci_dev *pdev, + struct cxgb3i_gather_list *gl) +{ + int i; + + for (i = 0; i < gl->nelem; i++) { + gl->phys_addr[i] = pci_map_page(pdev, gl->pages[i], 0, + PAGE_SIZE, + PCI_DMA_FROMDEVICE); + if (unlikely(pci_dma_mapping_error(pdev, gl->phys_addr[i]))) + goto unmap; + } + + return i; + +unmap: + if (i) { + unsigned int nelem = gl->nelem; + + gl->nelem = i; + ddp_gl_unmap(pdev, gl); + gl->nelem = nelem; + } + return -ENOMEM; +} + +/** + * cxgb3i_ddp_make_gl - build ddp page buffer list + * @xferlen: total buffer length + * @sgl: page buffer scatter-gather list + * @sgcnt: # of page buffers + * @pdev: pci_dev, used for pci map + * @gfp: allocation mode + * + * construct a ddp page buffer list from the scsi scattergather list. + * coalesce buffers as much as possible, and obtain dma addresses for + * each page. + * + * Return the cxgb3i_gather_list constructed from the page buffers if the + * memory can be used for ddp. Return NULL otherwise. 
+ */ +struct cxgb3i_gather_list *cxgb3i_ddp_make_gl(unsigned int xferlen, + struct scatterlist *sgl, + unsigned int sgcnt, + struct pci_dev *pdev, + gfp_t gfp) +{ + struct cxgb3i_gather_list *gl; + struct scatterlist *sg = sgl; + struct page *sgpage = sg_page(sg); + unsigned int sglen = sg->length; + unsigned int sgoffset = sg->offset; + unsigned int npages = (xferlen + sgoffset + PAGE_SIZE - 1) >> + PAGE_SHIFT; + int i = 1, j = 0; + + if (xferlen < DDP_THRESHOLD) { + ddp_log_debug("xfer %u < threshold %u, no ddp.\n", + xferlen, DDP_THRESHOLD); + return NULL; + } + + gl = kzalloc(sizeof(struct cxgb3i_gather_list) + + npages * (sizeof(dma_addr_t) + sizeof(struct page *)), + gfp); + if (!gl) + return NULL; + + gl->pages = (struct page **)&gl->phys_addr[npages]; + gl->length = xferlen; + gl->offset = sgoffset; + gl->pages[0] = sgpage; + + sg = sg_next(sg); + while (sg) { + struct page *page = sg_page(sg); + + if (sgpage == page && sg->offset == sgoffset + sglen) + sglen += sg->length; + else { + /* make sure the sgl is fit for ddp: + * each has the same page size, and + * all of the middle pages are used completely + */ + if ((j && sgoffset) || + ((i != sgcnt - 1) && + ((sglen + sgoffset) & ~PAGE_MASK))) + goto error_out; + + j++; + if (j == gl->nelem || sg->offset) + goto error_out; + gl->pages[j] = page; + sglen = sg->length; + sgoffset = sg->offset; + sgpage = page; + } + i++; + sg = sg_next(sg); + } + gl->nelem = ++j; + + if (ddp_gl_map(pdev, gl) < 0) + goto error_out; + + return gl; + +error_out: + kfree(gl); + return NULL; +} +EXPORT_SYMBOL_GPL(cxgb3i_ddp_make_gl); + +/** + * cxgb3i_ddp_release_gl - release a page buffer list + * @gl: a ddp page buffer list + * @pdev: pci_dev used for pci_unmap + * free a ddp page buffer list resulted from cxgb3i_ddp_make_gl(). 
+ */ +void cxgb3i_ddp_release_gl(struct cxgb3i_gather_list *gl, + struct pci_dev *pdev) +{ + ddp_gl_unmap(pdev, gl); + kfree(gl); +} +EXPORT_SYMBOL_GPL(cxgb3i_ddp_release_gl); + +/** + * cxgb3i_ddp_tag_reserve - set up ddp for a data transfer + * @tdev: t3cdev adapter + * @tid: connection id + * @tformat: tag format + * @tagp: the s/w tag, if ddp setup is successful, it will be updated with + * ddp/hw tag + * @gl: the page momory list + * @gfp: allocation mode + * + * ddp setup for a given page buffer list and construct the ddp tag. + * return 0 if success, < 0 otherwise. + */ +int cxgb3i_ddp_tag_reserve(struct t3cdev *tdev, unsigned int tid, + struct cxgb3i_tag_format *tformat, u32 *tagp, + struct cxgb3i_gather_list *gl, gfp_t gfp) +{ + struct cxgb3i_ddp_info *ddp = tdev->ulp_iscsi; + struct pagepod_hdr hdr; + unsigned int npods; + int idx = -1, idx_max; + int err = -ENOMEM; + u32 sw_tag = *tagp; + u32 tag; + + if (page_idx >= DDP_PGIDX_MAX || !ddp || !gl || !gl->nelem || + gl->length < DDP_THRESHOLD) { + ddp_log_debug("pgidx %u, xfer %u/%u, NO ddp.\n", + page_idx, gl->length, DDP_THRESHOLD); + return -EINVAL; + } + + npods = (gl->nelem + PPOD_PAGES_MAX - 1) >> PPOD_PAGES_SHIFT; + idx_max = ddp->nppods - npods + 1; + + if (ddp->idx_last == ddp->nppods) + idx = ddp_find_unused_entries(ddp, 0, idx_max, npods, gl); + else { + idx = ddp_find_unused_entries(ddp, ddp->idx_last + 1, + idx_max, npods, gl); + if (idx < 0 && ddp->idx_last >= npods) + idx = ddp_find_unused_entries(ddp, 0, + ddp->idx_last - npods + 1, + npods, gl); + } + if (idx < 0) { + ddp_log_debug("xferlen %u, gl %u, npods %u NO DDP.\n", + gl->length, gl->nelem, npods); + return idx; + } + + err = ddp_alloc_gl_skb(ddp, idx, npods, gfp); + if (err < 0) + goto unmark_entries; + + tag = cxgb3i_ddp_tag_base(tformat, sw_tag); + tag |= idx << PPOD_IDX_SHIFT; + + hdr.rsvd = 0; + hdr.vld_tid = htonl(F_PPOD_VALID | V_PPOD_TID(tid)); + hdr.pgsz_tag_clr = htonl(tag & ddp->rsvd_tag_mask); + hdr.maxoffset = 
htonl(gl->length); + hdr.pgoffset = htonl(gl->offset); + + err = set_ddp_map(ddp, &hdr, idx, npods, gl); + if (err < 0) + goto free_gl_skb; + + ddp->idx_last = idx; + ddp_log_debug("xfer %u, gl %u,%u, tid 0x%x, 0x%x -> 0x%x(%u,%u).\n", + gl->length, gl->nelem, gl->offset, tid, sw_tag, tag, + idx, npods); + *tagp = tag; + return 0; + +free_gl_skb: + ddp_free_gl_skb(ddp, idx, npods); +unmark_entries: + ddp_unmark_entries(ddp, idx, npods); + return err; +} +EXPORT_SYMBOL_GPL(cxgb3i_ddp_tag_reserve); + +/** + * cxgb3i_ddp_tag_release - release a ddp tag + * @tdev: t3cdev adapter + * @tag: ddp tag + * ddp cleanup for a given ddp tag and release all the resources held + */ +void cxgb3i_ddp_tag_release(struct t3cdev *tdev, u32 tag) +{ + struct cxgb3i_ddp_info *ddp = tdev->ulp_iscsi; + u32 idx; + + if (!ddp) { + ddp_log_error("release ddp tag 0x%x, ddp NULL.\n", tag); + return; + } + + idx = (tag >> PPOD_IDX_SHIFT) & ddp->idx_mask; + if (idx < ddp->nppods) { + struct cxgb3i_gather_list *gl = ddp->gl_map[idx]; + unsigned int npods; + + if (!gl) { + ddp_log_error("release ddp 0x%x, idx 0x%x, gl NULL.\n", + tag, idx); + return; + } + npods = (gl->nelem + PPOD_PAGES_MAX - 1) >> PPOD_PAGES_SHIFT; + ddp_log_debug("ddp tag 0x%x, release idx 0x%x, npods %u.\n", + tag, idx, npods); + clear_ddp_map(ddp, idx, npods); + ddp_unmark_entries(ddp, idx, npods); + cxgb3i_ddp_release_gl(gl, ddp->pdev); + } else + ddp_log_error("ddp tag 0x%x, idx 0x%x > max 0x%x.\n", + tag, idx, ddp->nppods); +} +EXPORT_SYMBOL_GPL(cxgb3i_ddp_tag_release); + +static int setup_conn_pgidx(struct t3cdev *tdev, unsigned int tid, int pg_idx, + int reply) +{ + struct sk_buff *skb = alloc_skb(sizeof(struct cpl_set_tcb_field), + GFP_KERNEL); + struct cpl_set_tcb_field *req; + u64 val = pg_idx < DDP_PGIDX_MAX ? 
pg_idx : 0; + + if (!skb) + return -ENOMEM; + + /* set up ulp submode and page size */ + req = (struct cpl_set_tcb_field *)skb_put(skb, sizeof(*req)); + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD)); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_SET_TCB_FIELD, tid)); + req->reply = V_NO_REPLY(reply ? 0 : 1); + req->cpu_idx = 0; + req->word = htons(31); + req->mask = cpu_to_be64(0xF0000000); + req->val = cpu_to_be64(val << 28); + skb->priority = CPL_PRIORITY_CONTROL; + + cxgb3_ofld_send(tdev, skb); + return 0; +} + +/** + * cxgb3i_setup_conn_host_pagesize - setup the conn.'s ddp page size + * @tdev: t3cdev adapter + * @tid: connection id + * @reply: request reply from h/w + * set up the ddp page size based on the host PAGE_SIZE for a connection + * identified by tid + */ +int cxgb3i_setup_conn_host_pagesize(struct t3cdev *tdev, unsigned int tid, + int reply) +{ + return setup_conn_pgidx(tdev, tid, page_idx, reply); +} +EXPORT_SYMBOL_GPL(cxgb3i_setup_conn_host_pagesize); + +/** + * cxgb3i_setup_conn_pagesize - setup the conn.'s ddp page size + * @tdev: t3cdev adapter + * @tid: connection id + * @reply: request reply from h/w + * @pgsz: ddp page size + * set up the ddp page size for a connection identified by tid + */ +int cxgb3i_setup_conn_pagesize(struct t3cdev *tdev, unsigned int tid, + int reply, unsigned long pgsz) +{ + int pgidx = cxgb3i_ddp_find_page_index(pgsz); + + return setup_conn_pgidx(tdev, tid, pgidx, reply); +} +EXPORT_SYMBOL_GPL(cxgb3i_setup_conn_pagesize); + +/** + * cxgb3i_setup_conn_digest - setup conn. 
digest setting + * @tdev: t3cdev adapter + * @tid: connection id + * @hcrc: header digest enabled + * @dcrc: data digest enabled + * @reply: request reply from h/w + * set up the iscsi digest settings for a connection identified by tid + */ +int cxgb3i_setup_conn_digest(struct t3cdev *tdev, unsigned int tid, + int hcrc, int dcrc, int reply) +{ + struct sk_buff *skb = alloc_skb(sizeof(struct cpl_set_tcb_field), + GFP_KERNEL); + struct cpl_set_tcb_field *req; + u64 val = (hcrc ? 1 : 0) | (dcrc ? 2 : 0); + + if (!skb) + return -ENOMEM; + + /* set up ulp submode and page size */ + req = (struct cpl_set_tcb_field *)skb_put(skb, sizeof(*req)); + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD)); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_SET_TCB_FIELD, tid)); + req->reply = V_NO_REPLY(reply ? 0 : 1); + req->cpu_idx = 0; + req->word = htons(31); + req->mask = cpu_to_be64(0x0F000000); + req->val = cpu_to_be64(val << 24); + skb->priority = CPL_PRIORITY_CONTROL; + + cxgb3_ofld_send(tdev, skb); + return 0; +} +EXPORT_SYMBOL_GPL(cxgb3i_setup_conn_digest); + +static int ddp_init(struct t3cdev *tdev) +{ + struct cxgb3i_ddp_info *ddp; + struct ulp_iscsi_info uinfo; + unsigned int ppmax, bits; + int i, err; + static int vers_printed; + + if (!vers_printed) { + printk(KERN_INFO "%s", version); + vers_printed = 1; + } + + err = tdev->ctl(tdev, ULP_ISCSI_GET_PARAMS, &uinfo); + if (err < 0) { + ddp_log_error("%s, failed to get iscsi param err=%d.\n", + tdev->name, err); + return err; + } + + ppmax = (uinfo.ulimit - uinfo.llimit + 1) >> PPOD_SIZE_SHIFT; + bits = __ilog2_u32(ppmax) + 1; + if (bits > PPOD_IDX_MAX_SIZE) + bits = PPOD_IDX_MAX_SIZE; + ppmax = (1 << (bits - 1)) - 1; + + ddp = cxgb3i_alloc_big_mem(sizeof(struct cxgb3i_ddp_info) + + ppmax * + (sizeof(struct cxgb3i_gather_list *) + + sizeof(struct sk_buff *)), + GFP_KERNEL); + if (!ddp) { + ddp_log_warn("%s unable to alloc ddp 0x%d, ddp disabled.\n", + tdev->name, ppmax); + return 0; + } + ddp->gl_map = (struct 
cxgb3i_gather_list **)(ddp + 1); + ddp->gl_skb = (struct sk_buff **)(((char *)ddp->gl_map) + + ppmax * + sizeof(struct cxgb3i_gather_list *)); + spin_lock_init(&ddp->map_lock); + + ddp->tdev = tdev; + ddp->pdev = uinfo.pdev; + ddp->max_txsz = min_t(unsigned int, uinfo.max_txsz, ULP2_MAX_PKT_SIZE); + ddp->max_rxsz = min_t(unsigned int, uinfo.max_rxsz, ULP2_MAX_PKT_SIZE); + ddp->llimit = uinfo.llimit; + ddp->ulimit = uinfo.ulimit; + ddp->nppods = ppmax; + ddp->idx_last = ppmax; + ddp->idx_bits = bits; + ddp->idx_mask = (1 << bits) - 1; + ddp->rsvd_tag_mask = (1 << (bits + PPOD_IDX_SHIFT)) - 1; + + uinfo.tagmask = ddp->idx_mask << PPOD_IDX_SHIFT; + for (i = 0; i < DDP_PGIDX_MAX; i++) + uinfo.pgsz_factor[i] = ddp_page_order[i]; + uinfo.ulimit = uinfo.llimit + (ppmax << PPOD_SIZE_SHIFT); + + err = tdev->ctl(tdev, ULP_ISCSI_SET_PARAMS, &uinfo); + if (err < 0) { + ddp_log_warn("%s unable to set iscsi param err=%d, " + "ddp disabled.\n", tdev->name, err); + goto free_ddp_map; + } + + tdev->ulp_iscsi = ddp; + + /* add to the list */ + write_lock(&cxgb3i_ddp_rwlock); + list_add_tail(&ddp->list, &cxgb3i_ddp_list); + write_unlock(&cxgb3i_ddp_rwlock); + + ddp_log_info("nppods %u (0x%x ~ 0x%x), bits %u, mask 0x%x,0x%x " + "pkt %u,%u.\n", + ppmax, ddp->llimit, ddp->ulimit, ddp->idx_bits, + ddp->idx_mask, ddp->rsvd_tag_mask, + ddp->max_txsz, ddp->max_rxsz); + return 0; + +free_ddp_map: + cxgb3i_free_big_mem(ddp); + return err; +} + +/** + * cxgb3i_adapter_ddp_init - initialize the adapter's ddp resource + * @tdev: t3cdev adapter + * @tformat: tag format + * @txsz: max tx pkt size, filled in by this func. + * @rxsz: max rx pkt size, filled in by this func. 
+ * initialize the ddp pagepod manager for a given adapter if needed and + * setup the tag format for a given iscsi entity + */ +int cxgb3i_adapter_ddp_init(struct t3cdev *tdev, + struct cxgb3i_tag_format *tformat, + unsigned int *txsz, unsigned int *rxsz) +{ + struct cxgb3i_ddp_info *ddp; + unsigned char idx_bits; + + if (!tformat) + return -EINVAL; + + if (!tdev->ulp_iscsi) { + int err = ddp_init(tdev); + if (err < 0) + return err; + } + ddp = (struct cxgb3i_ddp_info *)tdev->ulp_iscsi; + + idx_bits = 32 - tformat->sw_bits; + tformat->rsvd_bits = ddp->idx_bits; + tformat->rsvd_shift = PPOD_IDX_SHIFT; + tformat->rsvd_mask = (1 << tformat->rsvd_bits) - 1; + + ddp_log_info("tag format: sw %u, rsvd %u,%u, mask 0x%x.\n", + tformat->sw_bits, tformat->rsvd_bits, + tformat->rsvd_shift, tformat->rsvd_mask); + + *txsz = ddp->max_txsz; + *rxsz = ddp->max_rxsz; + ddp_log_info("ddp max pkt size: %u, %u.\n", + ddp->max_txsz, ddp->max_rxsz); + return 0; +} +EXPORT_SYMBOL_GPL(cxgb3i_adapter_ddp_init); + +static void ddp_release(struct cxgb3i_ddp_info *ddp) +{ + int i = 0; + struct t3cdev *tdev = ddp->tdev; + + tdev->ulp_iscsi = NULL; + while (i < ddp->nppods) { + struct cxgb3i_gather_list *gl = ddp->gl_map[i]; + if (gl) { + int npods = (gl->nelem + PPOD_PAGES_MAX - 1) + >> PPOD_PAGES_SHIFT; + + kfree(gl); + ddp_free_gl_skb(ddp, i, npods); + } else + i++; + } + cxgb3i_free_big_mem(ddp); +} + +/** + * cxgb3i_adapter_ddp_cleanup - release the adapter's ddp resource + * @tdev: t3cdev adapter + * release all the resource held by the ddp pagepod manager for a given + * adapter if needed + */ +void cxgb3i_adapter_ddp_cleanup(struct t3cdev *tdev) +{ + struct cxgb3i_ddp_info *ddp; + + /* remove from the list */ + write_lock(&cxgb3i_ddp_rwlock); + list_for_each_entry(ddp, &cxgb3i_ddp_list, list) { + if (ddp->tdev == tdev) { + list_del(&ddp->list); + break; + } + } + write_unlock(&cxgb3i_ddp_rwlock); + + if (ddp) + ddp_release(ddp); +} +EXPORT_SYMBOL_GPL(cxgb3i_adapter_ddp_cleanup); + +/** 
+ * cxgb3i_ddp_init_module - module init entry point + * initialize any driver wide global data structures + */ +static int __init cxgb3i_ddp_init_module(void) +{ + page_idx = cxgb3i_ddp_find_page_index(PAGE_SIZE); + ddp_log_info("system PAGE_SIZE %lu, ddp idx %u.\n", + PAGE_SIZE, page_idx); + return 0; +} + +/** + * cxgb3i_ddp_exit_module - module cleanup/exit entry point + * go through the ddp list and release any resource held. + */ +static void __exit cxgb3i_ddp_exit_module(void) +{ + struct cxgb3i_ddp_info *ddp; + + /* release all ddp manager if there is any */ + write_lock(&cxgb3i_ddp_rwlock); + list_for_each_entry(ddp, &cxgb3i_ddp_list, list) { + list_del(&ddp->list); + ddp_release(ddp); + } + write_unlock(&cxgb3i_ddp_rwlock); +} + +module_init(cxgb3i_ddp_init_module); +module_exit(cxgb3i_ddp_exit_module); diff --git a/drivers/scsi/cxgb3i/cxgb3i_ddp.h b/drivers/scsi/cxgb3i/cxgb3i_ddp.h new file mode 100644 index 00000000000..5c7c4d95c49 --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_ddp.h @@ -0,0 +1,306 @@ +/* + * cxgb3i_ddp.h: Chelsio S3xx iSCSI DDP Manager. + * + * Copyright (c) 2008 Chelsio Communications, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. 
+ * + * Written by: Karen Xie (kxie@chelsio.com) + */ + +#ifndef __CXGB3I_ULP2_DDP_H__ +#define __CXGB3I_ULP2_DDP_H__ + +/** + * struct cxgb3i_tag_format - cxgb3i ulp tag format for an iscsi entity + * + * @sw_bits: # of bits used by iscsi software layer + * @rsvd_bits: # of bits used by h/w + * @rsvd_shift: h/w bits shift left + * @rsvd_mask: reserved bit mask + */ +struct cxgb3i_tag_format { + unsigned char sw_bits; + unsigned char rsvd_bits; + unsigned char rsvd_shift; + unsigned char filler[1]; + u32 rsvd_mask; +}; + +/** + * struct cxgb3i_gather_list - cxgb3i direct data placement memory + * + * @tag: ddp tag + * @length: total data buffer length + * @offset: initial offset to the 1st page + * @nelem: # of pages + * @pages: page pointers + * @phys_addr: physical address + */ +struct cxgb3i_gather_list { + u32 tag; + unsigned int length; + unsigned int offset; + unsigned int nelem; + struct page **pages; + dma_addr_t phys_addr[0]; +}; + +/** + * struct cxgb3i_ddp_info - cxgb3i direct data placement for pdu payload + * + * @list: list head to link elements + * @tdev: pointer to t3cdev used by cxgb3 driver + * @max_txsz: max tx packet size for ddp + * @max_rxsz: max rx packet size for ddp + * @llimit: lower bound of the page pod memory + * @ulimit: upper bound of the page pod memory + * @nppods: # of page pod entries + * @idx_last: page pod entry last used + * @idx_bits: # of bits the pagepod index would take + * @idx_mask: pagepod index mask + * @rsvd_tag_mask: tag mask + * @map_lock: lock to synchonize access to the page pod map + * @gl_map: ddp memory gather list + * @gl_skb: skb used to program the pagepod + */ +struct cxgb3i_ddp_info { + struct list_head list; + struct t3cdev *tdev; + struct pci_dev *pdev; + unsigned int max_txsz; + unsigned int max_rxsz; + unsigned int llimit; + unsigned int ulimit; + unsigned int nppods; + unsigned int idx_last; + unsigned char idx_bits; + unsigned char filler[3]; + u32 idx_mask; + u32 rsvd_tag_mask; + spinlock_t map_lock; 
+ struct cxgb3i_gather_list **gl_map; + struct sk_buff **gl_skb; +}; + +#define ULP2_MAX_PKT_SIZE 16224 +#define ULP2_MAX_PDU_PAYLOAD (ULP2_MAX_PKT_SIZE - ISCSI_PDU_NONPAYLOAD_MAX) +#define PPOD_PAGES_MAX 4 +#define PPOD_PAGES_SHIFT 2 /* 4 pages per pod */ + +/* + * struct pagepod_hdr, pagepod - pagepod format + */ +struct pagepod_hdr { + u32 vld_tid; + u32 pgsz_tag_clr; + u32 maxoffset; + u32 pgoffset; + u64 rsvd; +}; + +struct pagepod { + struct pagepod_hdr hdr; + u64 addr[PPOD_PAGES_MAX + 1]; +}; + +#define PPOD_SIZE sizeof(struct pagepod) /* 64 */ +#define PPOD_SIZE_SHIFT 6 + +#define PPOD_COLOR_SHIFT 0 +#define PPOD_COLOR_SIZE 6 +#define PPOD_COLOR_MASK ((1 << PPOD_COLOR_SIZE) - 1) + +#define PPOD_IDX_SHIFT PPOD_COLOR_SIZE +#define PPOD_IDX_MAX_SIZE 24 + +#define S_PPOD_TID 0 +#define M_PPOD_TID 0xFFFFFF +#define V_PPOD_TID(x) ((x) << S_PPOD_TID) + +#define S_PPOD_VALID 24 +#define V_PPOD_VALID(x) ((x) << S_PPOD_VALID) +#define F_PPOD_VALID V_PPOD_VALID(1U) + +#define S_PPOD_COLOR 0 +#define M_PPOD_COLOR 0x3F +#define V_PPOD_COLOR(x) ((x) << S_PPOD_COLOR) + +#define S_PPOD_TAG 6 +#define M_PPOD_TAG 0xFFFFFF +#define V_PPOD_TAG(x) ((x) << S_PPOD_TAG) + +#define S_PPOD_PGSZ 30 +#define M_PPOD_PGSZ 0x3 +#define V_PPOD_PGSZ(x) ((x) << S_PPOD_PGSZ) + +/* + * large memory chunk allocation/release + * use vmalloc() if kmalloc() fails + */ +static inline void *cxgb3i_alloc_big_mem(unsigned int size, + gfp_t gfp) +{ + void *p = kmalloc(size, gfp); + if (!p) + p = vmalloc(size); + if (p) + memset(p, 0, size); + return p; +} + +static inline void cxgb3i_free_big_mem(void *addr) +{ + if (is_vmalloc_addr(addr)) + vfree(addr); + else + kfree(addr); +} + +/* + * cxgb3i ddp tag are 32 bits, it consists of reserved bits used by h/w and + * non-reserved bits that can be used by the iscsi s/w. + * The reserved bits are identified by the rsvd_bits and rsvd_shift fields + * in struct cxgb3i_tag_format. 
+ * + * The uppermost reserved bit can be used to check if a tag is a ddp tag or not: + * if the bit is 0, the tag is a valid ddp tag + */ + +/** + * cxgb3i_is_ddp_tag - check if a given tag is a hw/ddp tag + * @tformat: tag format information + * @tag: tag to be checked + * + * return true if the tag is a ddp tag, false otherwise. + */ +static inline int cxgb3i_is_ddp_tag(struct cxgb3i_tag_format *tformat, u32 tag) +{ + return !(tag & (1 << (tformat->rsvd_bits + tformat->rsvd_shift - 1))); +} + +/** + * cxgb3i_sw_tag_usable - check if a given s/w tag has enough bits left for + * the reserved/hw bits + * @tformat: tag format information + * @sw_tag: s/w tag to be checked + * + * return true if the s/w tag leaves room for the reserved bits, false otherwise. + */ +static inline int cxgb3i_sw_tag_usable(struct cxgb3i_tag_format *tformat, + u32 sw_tag) +{ + sw_tag >>= (32 - tformat->rsvd_bits); + return !sw_tag; +} + +/** + * cxgb3i_set_non_ddp_tag - mark a given s/w tag as an invalid ddp tag + * @tformat: tag format information + * @sw_tag: s/w tag to be checked + * + * insert 1 at the uppermost reserved bit to mark it as an invalid ddp tag. + */ +static inline u32 cxgb3i_set_non_ddp_tag(struct cxgb3i_tag_format *tformat, + u32 sw_tag) +{ + unsigned char shift = tformat->rsvd_bits + tformat->rsvd_shift - 1; + u32 mask = (1 << shift) - 1; + + if (sw_tag && (sw_tag & ~mask)) { + u32 v1 = sw_tag & ((1 << shift) - 1); + u32 v2 = (sw_tag >> (shift - 1)) << shift; + + return v2 | v1 | 1 << shift; + } + return sw_tag | 1 << shift; +} + +/** + * cxgb3i_ddp_tag_base - shift the s/w tag bits so that reserved bits are not + * used. 
+ * @tformat: tag format information + * @sw_tag: s/w tag to be checked + */ +static inline u32 cxgb3i_ddp_tag_base(struct cxgb3i_tag_format *tformat, + u32 sw_tag) +{ + u32 mask = (1 << tformat->rsvd_shift) - 1; + + if (sw_tag && (sw_tag & ~mask)) { + u32 v1 = sw_tag & mask; + u32 v2 = sw_tag >> tformat->rsvd_shift; + + v2 <<= tformat->rsvd_shift + tformat->rsvd_bits; + return v2 | v1; + } + return sw_tag; +} + +/** + * cxgb3i_tag_rsvd_bits - get the reserved bits used by the h/w + * @tformat: tag format information + * @tag: tag to be checked + * + * return the reserved bits in the tag + */ +static inline u32 cxgb3i_tag_rsvd_bits(struct cxgb3i_tag_format *tformat, + u32 tag) +{ + if (cxgb3i_is_ddp_tag(tformat, tag)) + return (tag >> tformat->rsvd_shift) & tformat->rsvd_mask; + return 0; +} + +/** + * cxgb3i_tag_nonrsvd_bits - get the non-reserved bits used by the s/w + * @tformat: tag format information + * @tag: tag to be checked + * + * return the non-reserved bits in the tag. + */ +static inline u32 cxgb3i_tag_nonrsvd_bits(struct cxgb3i_tag_format *tformat, + u32 tag) +{ + unsigned char shift = tformat->rsvd_bits + tformat->rsvd_shift - 1; + u32 v1, v2; + + if (cxgb3i_is_ddp_tag(tformat, tag)) { + v1 = tag & ((1 << tformat->rsvd_shift) - 1); + v2 = (tag >> (shift + 1)) << tformat->rsvd_shift; + } else { + u32 mask = (1 << shift) - 1; + + tag &= ~(1 << shift); + v1 = tag & mask; + v2 = (tag >> 1) & ~mask; + } + return v1 | v2; +} + +int cxgb3i_ddp_tag_reserve(struct t3cdev *, unsigned int tid, + struct cxgb3i_tag_format *, u32 *tag, + struct cxgb3i_gather_list *, gfp_t gfp); +void cxgb3i_ddp_tag_release(struct t3cdev *, u32 tag); + +struct cxgb3i_gather_list *cxgb3i_ddp_make_gl(unsigned int xferlen, + struct scatterlist *sgl, + unsigned int sgcnt, + struct pci_dev *pdev, + gfp_t gfp); +void cxgb3i_ddp_release_gl(struct cxgb3i_gather_list *gl, + struct pci_dev *pdev); + +int cxgb3i_setup_conn_host_pagesize(struct t3cdev *, unsigned int tid, + int reply); +int 
cxgb3i_setup_conn_pagesize(struct t3cdev *, unsigned int tid, int reply, + unsigned long pgsz); +int cxgb3i_setup_conn_digest(struct t3cdev *, unsigned int tid, + int hcrc, int dcrc, int reply); +int cxgb3i_ddp_find_page_index(unsigned long pgsz); +int cxgb3i_adapter_ddp_init(struct t3cdev *, struct cxgb3i_tag_format *, + unsigned int *txsz, unsigned int *rxsz); +void cxgb3i_adapter_ddp_cleanup(struct t3cdev *); +#endif diff --git a/drivers/scsi/cxgb3i/cxgb3i_init.c b/drivers/scsi/cxgb3i/cxgb3i_init.c new file mode 100644 index 00000000000..091ecb4d9f3 --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_init.c @@ -0,0 +1,107 @@ +/* cxgb3i_init.c: Chelsio S3xx iSCSI driver. + * + * Copyright (c) 2008 Chelsio Communications, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + * + * Written by: Karen Xie (kxie@chelsio.com) + */ + +#include "cxgb3i.h" + +#define DRV_MODULE_NAME "cxgb3i" +#define DRV_MODULE_VERSION "1.0.0" +#define DRV_MODULE_RELDATE "Jun. 
1, 2008" + +static char version[] = + "Chelsio S3xx iSCSI Driver " DRV_MODULE_NAME + " v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n"; + +MODULE_AUTHOR("Karen Xie "); +MODULE_DESCRIPTION("Chelsio S3xx iSCSI Driver"); +MODULE_LICENSE("GPL"); +MODULE_VERSION(DRV_MODULE_VERSION); + +static void open_s3_dev(struct t3cdev *); +static void close_s3_dev(struct t3cdev *); + +static cxgb3_cpl_handler_func cxgb3i_cpl_handlers[NUM_CPL_CMDS]; +static struct cxgb3_client t3c_client = { + .name = "iscsi_cxgb3", + .handlers = cxgb3i_cpl_handlers, + .add = open_s3_dev, + .remove = close_s3_dev, +}; + +/** + * open_s3_dev - register with cxgb3 LLD + * @t3dev: cxgb3 adapter instance + */ +static void open_s3_dev(struct t3cdev *t3dev) +{ + static int vers_printed; + + if (!vers_printed) { + printk(KERN_INFO "%s", version); + vers_printed = 1; + } + + cxgb3i_sdev_add(t3dev, &t3c_client); + cxgb3i_adapter_add(t3dev); +} + +/** + * close_s3_dev - de-register with cxgb3 LLD + * @t3dev: cxgb3 adapter instance + */ +static void close_s3_dev(struct t3cdev *t3dev) +{ + cxgb3i_adapter_remove(t3dev); + cxgb3i_sdev_remove(t3dev); +} + +/** + * cxgb3i_init_module - module init entry point + * + * initialize any driver wide global data structures and register itself + * with the cxgb3 module + */ +static int __init cxgb3i_init_module(void) +{ + int err; + + err = cxgb3i_sdev_init(cxgb3i_cpl_handlers); + if (err < 0) + return err; + + err = cxgb3i_iscsi_init(); + if (err < 0) + return err; + + err = cxgb3i_pdu_init(); + if (err < 0) + return err; + + cxgb3_register_client(&t3c_client); + + return 0; +} + +/** + * cxgb3i_exit_module - module cleanup/exit entry point + * + * go through the driver hba list and for each hba, release any resource held. 
+ * and unregisters iscsi transport and the cxgb3 module + */ +static void __exit cxgb3i_exit_module(void) +{ + cxgb3_unregister_client(&t3c_client); + cxgb3i_pdu_cleanup(); + cxgb3i_iscsi_cleanup(); + cxgb3i_sdev_cleanup(); +} + +module_init(cxgb3i_init_module); +module_exit(cxgb3i_exit_module); diff --git a/drivers/scsi/cxgb3i/cxgb3i_iscsi.c b/drivers/scsi/cxgb3i/cxgb3i_iscsi.c new file mode 100644 index 00000000000..d83464b9b3f --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_iscsi.c @@ -0,0 +1,951 @@ +/* cxgb3i_iscsi.c: Chelsio S3xx iSCSI driver. + * + * Copyright (c) 2008 Chelsio Communications, Inc. + * Copyright (c) 2008 Mike Christie + * Copyright (c) 2008 Red Hat, Inc. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + * + * Written by: Karen Xie (kxie@chelsio.com) + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cxgb3i.h" +#include "cxgb3i_pdu.h" + +#ifdef __DEBUG_CXGB3I_TAG__ +#define cxgb3i_tag_debug cxgb3i_log_debug +#else +#define cxgb3i_tag_debug(fmt...) +#endif + +#ifdef __DEBUG_CXGB3I_API__ +#define cxgb3i_api_debug cxgb3i_log_debug +#else +#define cxgb3i_api_debug(fmt...) 
+#endif + +/* + * align pdu size to multiple of 512 for better performance + */ +#define align_pdu_size(n) do { n = (n) & (~511); } while (0) + +static struct scsi_transport_template *cxgb3i_scsi_transport; +static struct scsi_host_template cxgb3i_host_template; +static struct iscsi_transport cxgb3i_iscsi_transport; +static unsigned char sw_tag_idx_bits; +static unsigned char sw_tag_age_bits; + +static LIST_HEAD(cxgb3i_snic_list); +static DEFINE_RWLOCK(cxgb3i_snic_rwlock); + +/** + * cxgb3i_adapter_add - init a s3 adapter structure and any h/w settings + * @t3dev: t3cdev adapter + * return the resulting cxgb3i_adapter struct + */ +struct cxgb3i_adapter *cxgb3i_adapter_add(struct t3cdev *t3dev) +{ + struct cxgb3i_adapter *snic; + struct adapter *adapter = tdev2adap(t3dev); + int i; + + snic = kzalloc(sizeof(*snic), GFP_KERNEL); + if (!snic) { + cxgb3i_api_debug("cxgb3 %s, OOM.\n", t3dev->name); + return NULL; + } + spin_lock_init(&snic->lock); + + snic->tdev = t3dev; + snic->pdev = adapter->pdev; + snic->tag_format.sw_bits = sw_tag_idx_bits + sw_tag_age_bits; + + if (cxgb3i_adapter_ddp_init(t3dev, &snic->tag_format, + &snic->tx_max_size, + &snic->rx_max_size) < 0) + goto free_snic; + + for_each_port(adapter, i) { + snic->hba[i] = cxgb3i_hba_host_add(snic, adapter->port[i]); + if (!snic->hba[i]) + goto ulp_cleanup; + } + snic->hba_cnt = adapter->params.nports; + + /* add to the list */ + write_lock(&cxgb3i_snic_rwlock); + list_add_tail(&snic->list_head, &cxgb3i_snic_list); + write_unlock(&cxgb3i_snic_rwlock); + + return snic; + +ulp_cleanup: + cxgb3i_adapter_ddp_cleanup(t3dev); +free_snic: + kfree(snic); + return NULL; +} + +/** + * cxgb3i_adapter_remove - release all the resources held and cleanup any + * h/w settings + * @t3dev: t3cdev adapter + */ +void cxgb3i_adapter_remove(struct t3cdev *t3dev) +{ + int i; + struct cxgb3i_adapter *snic; + + /* remove from the list */ + write_lock(&cxgb3i_snic_rwlock); + list_for_each_entry(snic, &cxgb3i_snic_list, list_head) { + 
if (snic->tdev == t3dev) { + list_del(&snic->list_head); + break; + } + } + write_unlock(&cxgb3i_snic_rwlock); + + if (snic) { + for (i = 0; i < snic->hba_cnt; i++) { + if (snic->hba[i]) { + cxgb3i_hba_host_remove(snic->hba[i]); + snic->hba[i] = NULL; + } + } + + /* release ddp resources */ + cxgb3i_adapter_ddp_cleanup(snic->tdev); + kfree(snic); + } +} + +/** + * cxgb3i_hba_find_by_netdev - find the cxgb3i_hba structure with a given + * net_device + * @ndev: the net_device to look up + */ +struct cxgb3i_hba *cxgb3i_hba_find_by_netdev(struct net_device *ndev) +{ + struct cxgb3i_adapter *snic; + int i; + + read_lock(&cxgb3i_snic_rwlock); + list_for_each_entry(snic, &cxgb3i_snic_list, list_head) { + for (i = 0; i < snic->hba_cnt; i++) { + if (snic->hba[i]->ndev == ndev) { + read_unlock(&cxgb3i_snic_rwlock); + return snic->hba[i]; + } + } + } + read_unlock(&cxgb3i_snic_rwlock); + return NULL; +} + +/** + * cxgb3i_hba_host_add - register a new host with scsi/iscsi + * @snic: the cxgb3i adapter + * @ndev: associated net_device + */ +struct cxgb3i_hba *cxgb3i_hba_host_add(struct cxgb3i_adapter *snic, + struct net_device *ndev) +{ + struct cxgb3i_hba *hba; + struct Scsi_Host *shost; + int err; + + shost = iscsi_host_alloc(&cxgb3i_host_template, + sizeof(struct cxgb3i_hba), + CXGB3I_SCSI_QDEPTH_DFLT); + if (!shost) { + cxgb3i_log_info("iscsi_host_alloc failed.\n"); + return NULL; + } + + shost->transportt = cxgb3i_scsi_transport; + shost->max_lun = CXGB3I_MAX_LUN; + shost->max_id = CXGB3I_MAX_TARGET; + shost->max_channel = 0; + shost->max_cmd_len = 16; + + hba = iscsi_host_priv(shost); + hba->snic = snic; + hba->ndev = ndev; + hba->shost = shost; + + pci_dev_get(snic->pdev); + err = iscsi_host_add(shost, &snic->pdev->dev); + if (err) { + cxgb3i_log_info("iscsi_host_add failed.\n"); + goto pci_dev_put; + } + + cxgb3i_api_debug("shost 0x%p, hba 0x%p, no %u.\n", + shost, hba, shost->host_no); + + return hba; + +pci_dev_put: + pci_dev_put(snic->pdev); + scsi_host_put(shost); + return 
NULL; +} + +/** + * cxgb3i_hba_host_remove - de-register the host with scsi/iscsi + * @hba: the cxgb3i hba + */ +void cxgb3i_hba_host_remove(struct cxgb3i_hba *hba) +{ + cxgb3i_api_debug("shost 0x%p, hba 0x%p, no %u.\n", + hba->shost, hba, hba->shost->host_no); + iscsi_host_remove(hba->shost); + pci_dev_put(hba->snic->pdev); + iscsi_host_free(hba->shost); +} + +/** + * cxgb3i_ep_connect - establish TCP connection to target portal + * @dst_addr: target IP address + * @non_blocking: blocking or non-blocking call + * + * Initiates a TCP/IP connection to the dst_addr + */ +static struct iscsi_endpoint *cxgb3i_ep_connect(struct sockaddr *dst_addr, + int non_blocking) +{ + struct iscsi_endpoint *ep; + struct cxgb3i_endpoint *cep; + struct cxgb3i_hba *hba; + struct s3_conn *c3cn = NULL; + int err = 0; + + c3cn = cxgb3i_c3cn_create(); + if (!c3cn) { + cxgb3i_log_info("ep connect OOM.\n"); + err = -ENOMEM; + goto release_conn; + } + + err = cxgb3i_c3cn_connect(c3cn, (struct sockaddr_in *)dst_addr); + if (err < 0) { + cxgb3i_log_info("ep connect failed.\n"); + goto release_conn; + } + hba = cxgb3i_hba_find_by_netdev(c3cn->dst_cache->dev); + if (!hba) { + err = -ENOSPC; + cxgb3i_log_info("NOT going through cxgbi device.\n"); + goto release_conn; + } + if (c3cn_is_closing(c3cn)) { + err = -ENOSPC; + cxgb3i_log_info("ep connect unable to connect.\n"); + goto release_conn; + } + + ep = iscsi_create_endpoint(sizeof(*cep)); + if (!ep) { + err = -ENOMEM; + cxgb3i_log_info("iscsi alloc ep, OOM.\n"); + goto release_conn; + } + cep = ep->dd_data; + cep->c3cn = c3cn; + cep->hba = hba; + + cxgb3i_api_debug("ep 0x%p, 0x%p, c3cn 0x%p, hba 0x%p.\n", + ep, cep, c3cn, hba); + return ep; + +release_conn: + cxgb3i_api_debug("conn 0x%p failed, release.\n", c3cn); + if (c3cn) + cxgb3i_c3cn_release(c3cn); + return ERR_PTR(err); +} + +/** + * cxgb3i_ep_poll - polls for TCP connection establishment + * @ep: TCP connection (endpoint) handle + * @timeout_ms: timeout value in milliseconds + * + * polls 
for TCP connect request to complete + */ +static int cxgb3i_ep_poll(struct iscsi_endpoint *ep, int timeout_ms) +{ + struct cxgb3i_endpoint *cep = ep->dd_data; + struct s3_conn *c3cn = cep->c3cn; + + if (!c3cn_is_established(c3cn)) + return 0; + cxgb3i_api_debug("ep 0x%p, c3cn 0x%p established.\n", ep, c3cn); + return 1; +} + +/** + * cxgb3i_ep_disconnect - teardown TCP connection + * @ep: TCP connection (endpoint) handle + * + * teardown TCP connection + */ +static void cxgb3i_ep_disconnect(struct iscsi_endpoint *ep) +{ + struct cxgb3i_endpoint *cep = ep->dd_data; + struct cxgb3i_conn *cconn = cep->cconn; + + cxgb3i_api_debug("ep 0x%p, cep 0x%p.\n", ep, cep); + + if (cconn && cconn->conn) { + /* + * stop the xmit path so the xmit_pdu function is + * not being called + */ + iscsi_suspend_tx(cconn->conn); + + write_lock_bh(&cep->c3cn->callback_lock); + cep->c3cn->user_data = NULL; + cconn->cep = NULL; + write_unlock_bh(&cep->c3cn->callback_lock); + } + + cxgb3i_api_debug("ep 0x%p, cep 0x%p, release c3cn 0x%p.\n", + ep, cep, cep->c3cn); + cxgb3i_c3cn_release(cep->c3cn); + iscsi_destroy_endpoint(ep); +} + +/** + * cxgb3i_session_create - create a new iscsi session + * @cmds_max: max # of commands + * @qdepth: scsi queue depth + * @initial_cmdsn: initial iscsi CMDSN for this session + * @host_no: pointer to return host no + * + * Creates a new iSCSI session + */ +static struct iscsi_cls_session * +cxgb3i_session_create(struct iscsi_endpoint *ep, u16 cmds_max, u16 qdepth, + u32 initial_cmdsn, u32 *host_no) +{ + struct cxgb3i_endpoint *cep; + struct cxgb3i_hba *hba; + struct Scsi_Host *shost; + struct iscsi_cls_session *cls_session; + struct iscsi_session *session; + + if (!ep) { + cxgb3i_log_error("%s, missing endpoint.\n", __func__); + return NULL; + } + + cep = ep->dd_data; + hba = cep->hba; + shost = hba->shost; + cxgb3i_api_debug("ep 0x%p, cep 0x%p, hba 0x%p.\n", ep, cep, hba); + BUG_ON(hba != iscsi_host_priv(shost)); + + *host_no = shost->host_no; + + cls_session = 
iscsi_session_setup(&cxgb3i_iscsi_transport, shost, + cmds_max, + sizeof(struct iscsi_tcp_task), + initial_cmdsn, ISCSI_MAX_TARGET); + if (!cls_session) + return NULL; + session = cls_session->dd_data; + if (iscsi_tcp_r2tpool_alloc(session)) + goto remove_session; + + return cls_session; + +remove_session: + iscsi_session_teardown(cls_session); + return NULL; +} + +/** + * cxgb3i_session_destroy - destroys iscsi session + * @cls_session: pointer to iscsi cls session + * + * Destroys an iSCSI session instance and releases all resources it holds + */ +static void cxgb3i_session_destroy(struct iscsi_cls_session *cls_session) +{ + cxgb3i_api_debug("sess 0x%p.\n", cls_session); + iscsi_tcp_r2tpool_free(cls_session->dd_data); + iscsi_session_teardown(cls_session); +} + +/** + * cxgb3i_conn_max_xmit_dlength -- check the max. xmit pdu segment size, + * reduce it to be within the hardware limit if needed + * @conn: iscsi connection + */ +static inline int cxgb3i_conn_max_xmit_dlength(struct iscsi_conn *conn) + +{ + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + unsigned int max = min_t(unsigned int, ULP2_MAX_PDU_PAYLOAD, + cconn->hba->snic->tx_max_size - + ISCSI_PDU_NONPAYLOAD_MAX); + + if (conn->max_xmit_dlength) + conn->max_xmit_dlength = min_t(unsigned int, + conn->max_xmit_dlength, max); + else + conn->max_xmit_dlength = max; + align_pdu_size(conn->max_xmit_dlength); + cxgb3i_log_info("conn 0x%p, max xmit %u.\n", + conn, conn->max_xmit_dlength); + return 0; +} + +/** + * cxgb3i_conn_max_recv_dlength -- check the max. recv pdu segment size against + * the hardware limit + * @conn: iscsi connection + * return 0 if the value is valid, < 0 otherwise. 
+ */ +static inline int cxgb3i_conn_max_recv_dlength(struct iscsi_conn *conn) +{ + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + unsigned int max = min_t(unsigned int, ULP2_MAX_PDU_PAYLOAD, + cconn->hba->snic->rx_max_size - + ISCSI_PDU_NONPAYLOAD_MAX); + + align_pdu_size(max); + if (conn->max_recv_dlength) { + if (conn->max_recv_dlength > max) { + cxgb3i_log_error("MaxRecvDataSegmentLength %u too big." + " Need to be <= %u.\n", + conn->max_recv_dlength, max); + return -EINVAL; + } + conn->max_recv_dlength = min_t(unsigned int, + conn->max_recv_dlength, max); + align_pdu_size(conn->max_recv_dlength); + } else + conn->max_recv_dlength = max; + cxgb3i_api_debug("conn 0x%p, max recv %u.\n", + conn, conn->max_recv_dlength); + return 0; +} + +/** + * cxgb3i_conn_create - create iscsi connection instance + * @cls_session: pointer to iscsi cls session + * @cid: iscsi cid + * + * Creates a new iSCSI connection instance for a given session + */ +static struct iscsi_cls_conn *cxgb3i_conn_create(struct iscsi_cls_session + *cls_session, u32 cid) +{ + struct iscsi_cls_conn *cls_conn; + struct iscsi_conn *conn; + struct iscsi_tcp_conn *tcp_conn; + struct cxgb3i_conn *cconn; + + cxgb3i_api_debug("sess 0x%p, cid %u.\n", cls_session, cid); + + cls_conn = iscsi_tcp_conn_setup(cls_session, sizeof(*cconn), cid); + if (!cls_conn) + return NULL; + conn = cls_conn->dd_data; + tcp_conn = conn->dd_data; + cconn = tcp_conn->dd_data; + + cconn->conn = conn; + return cls_conn; +} + +/** + * cxgb3i_conn_bind - binds iscsi sess, conn and endpoint together + * @cls_session: pointer to iscsi cls session + * @cls_conn: pointer to iscsi cls conn + * @transport_eph: 64-bit EP handle + * @is_leading: leading connection on this session? + * + * Binds together an iSCSI session, an iSCSI connection and a + * TCP connection. 
This routine returns error code if the TCP + * connection does not belong on the device iSCSI sess/conn is bound + */ + +static int cxgb3i_conn_bind(struct iscsi_cls_session *cls_session, + struct iscsi_cls_conn *cls_conn, + u64 transport_eph, int is_leading) +{ + struct iscsi_conn *conn = cls_conn->dd_data; + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + struct cxgb3i_adapter *snic; + struct iscsi_endpoint *ep; + struct cxgb3i_endpoint *cep; + struct s3_conn *c3cn; + int err; + + ep = iscsi_lookup_endpoint(transport_eph); + if (!ep) + return -EINVAL; + + /* setup ddp pagesize */ + cep = ep->dd_data; + c3cn = cep->c3cn; + snic = cep->hba->snic; + err = cxgb3i_setup_conn_host_pagesize(snic->tdev, c3cn->tid, 0); + if (err < 0) + return err; + + cxgb3i_api_debug("ep 0x%p, cls sess 0x%p, cls conn 0x%p.\n", + ep, cls_session, cls_conn); + + err = iscsi_conn_bind(cls_session, cls_conn, is_leading); + if (err) + return -EINVAL; + + /* calculate the tag idx bits needed for this conn based on cmds_max */ + cconn->task_idx_bits = (__ilog2_u32(conn->session->cmds_max - 1)) + 1; + cxgb3i_api_debug("session cmds_max 0x%x, bits %u.\n", + conn->session->cmds_max, cconn->task_idx_bits); + + read_lock(&c3cn->callback_lock); + c3cn->user_data = conn; + cconn->hba = cep->hba; + cconn->cep = cep; + cep->cconn = cconn; + read_unlock(&c3cn->callback_lock); + + cxgb3i_conn_max_xmit_dlength(conn); + cxgb3i_conn_max_recv_dlength(conn); + + spin_lock_bh(&conn->session->lock); + sprintf(conn->portal_address, NIPQUAD_FMT, + NIPQUAD(c3cn->daddr.sin_addr.s_addr)); + conn->portal_port = ntohs(c3cn->daddr.sin_port); + spin_unlock_bh(&conn->session->lock); + + /* init recv engine */ + iscsi_tcp_hdr_recv_prep(tcp_conn); + + return 0; +} + +/** + * cxgb3i_conn_get_param - return iscsi connection parameter to caller + * @cls_conn: pointer to iscsi cls conn + * @param: parameter type identifier + * @buf: buffer pointer + * + * returns iSCSI 
connection parameters + */ +static int cxgb3i_conn_get_param(struct iscsi_cls_conn *cls_conn, + enum iscsi_param param, char *buf) +{ + struct iscsi_conn *conn = cls_conn->dd_data; + int len; + + cxgb3i_api_debug("cls_conn 0x%p, param %d.\n", cls_conn, param); + + switch (param) { + case ISCSI_PARAM_CONN_PORT: + spin_lock_bh(&conn->session->lock); + len = sprintf(buf, "%hu\n", conn->portal_port); + spin_unlock_bh(&conn->session->lock); + break; + case ISCSI_PARAM_CONN_ADDRESS: + spin_lock_bh(&conn->session->lock); + len = sprintf(buf, "%s\n", conn->portal_address); + spin_unlock_bh(&conn->session->lock); + break; + default: + return iscsi_conn_get_param(cls_conn, param, buf); + } + + return len; +} + +/** + * cxgb3i_conn_set_param - set iscsi connection parameter + * @cls_conn: pointer to iscsi cls conn + * @param: parameter type identifier + * @buf: buffer pointer + * @buflen: buffer length + * + * set iSCSI connection parameters + */ +static int cxgb3i_conn_set_param(struct iscsi_cls_conn *cls_conn, + enum iscsi_param param, char *buf, int buflen) +{ + struct iscsi_conn *conn = cls_conn->dd_data; + struct iscsi_session *session = conn->session; + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + struct cxgb3i_adapter *snic = cconn->hba->snic; + struct s3_conn *c3cn = cconn->cep->c3cn; + int value, err = 0; + + switch (param) { + case ISCSI_PARAM_HDRDGST_EN: + err = iscsi_set_param(cls_conn, param, buf, buflen); + if (!err && conn->hdrdgst_en) + err = cxgb3i_setup_conn_digest(snic->tdev, c3cn->tid, + conn->hdrdgst_en, + conn->datadgst_en, 0); + break; + case ISCSI_PARAM_DATADGST_EN: + err = iscsi_set_param(cls_conn, param, buf, buflen); + if (!err && conn->datadgst_en) + err = cxgb3i_setup_conn_digest(snic->tdev, c3cn->tid, + conn->hdrdgst_en, + conn->datadgst_en, 0); + break; + case ISCSI_PARAM_MAX_R2T: + sscanf(buf, "%d", &value); + if (value <= 0 || !is_power_of_2(value)) + return -EINVAL; + if (session->max_r2t 
== value) + break; + iscsi_tcp_r2tpool_free(session); + err = iscsi_set_param(cls_conn, param, buf, buflen); + if (!err && iscsi_tcp_r2tpool_alloc(session)) + return -ENOMEM; + case ISCSI_PARAM_MAX_RECV_DLENGTH: + err = iscsi_set_param(cls_conn, param, buf, buflen); + if (!err) + err = cxgb3i_conn_max_recv_dlength(conn); + break; + case ISCSI_PARAM_MAX_XMIT_DLENGTH: + err = iscsi_set_param(cls_conn, param, buf, buflen); + if (!err) + err = cxgb3i_conn_max_xmit_dlength(conn); + break; + default: + return iscsi_set_param(cls_conn, param, buf, buflen); + } + return err; +} + +/** + * cxgb3i_host_set_param - configure host (adapter) related parameters + * @shost: scsi host pointer + * @param: parameter type identifier + * @buf: buffer pointer + */ +static int cxgb3i_host_set_param(struct Scsi_Host *shost, + enum iscsi_host_param param, + char *buf, int buflen) +{ + struct cxgb3i_hba *hba = iscsi_host_priv(shost); + + cxgb3i_api_debug("param %d, buf %s.\n", param, buf); + + switch (param) { + case ISCSI_HOST_PARAM_IPADDRESS: + { + __be32 addr = in_aton(buf); + cxgb3i_set_private_ipv4addr(hba->ndev, addr); + return 0; + } + case ISCSI_HOST_PARAM_HWADDRESS: + case ISCSI_HOST_PARAM_NETDEV_NAME: + /* ignore */ + return 0; + default: + return iscsi_host_set_param(shost, param, buf, buflen); + } +} + +/** + * cxgb3i_host_get_param - returns host (adapter) related parameters + * @shost: scsi host pointer + * @param: parameter type identifier + * @buf: buffer pointer + */ +static int cxgb3i_host_get_param(struct Scsi_Host *shost, + enum iscsi_host_param param, char *buf) +{ + struct cxgb3i_hba *hba = iscsi_host_priv(shost); + int len = 0; + + cxgb3i_api_debug("hba %s, param %d.\n", hba->ndev->name, param); + + switch (param) { + case ISCSI_HOST_PARAM_HWADDRESS: + len = sysfs_format_mac(buf, hba->ndev->dev_addr, 6); + break; + case ISCSI_HOST_PARAM_NETDEV_NAME: + len = sprintf(buf, "%s\n", hba->ndev->name); + break; + case ISCSI_HOST_PARAM_IPADDRESS: + { + __be32 addr; + + addr 
= cxgb3i_get_private_ipv4addr(hba->ndev); + len = sprintf(buf, NIPQUAD_FMT, NIPQUAD(addr)); + break; + } + default: + return iscsi_host_get_param(shost, param, buf); + } + return len; +} + +/** + * cxgb3i_conn_get_stats - returns iSCSI stats + * @cls_conn: pointer to iscsi cls conn + * @stats: pointer to iscsi statistic struct + */ +static void cxgb3i_conn_get_stats(struct iscsi_cls_conn *cls_conn, + struct iscsi_stats *stats) +{ + struct iscsi_conn *conn = cls_conn->dd_data; + + stats->txdata_octets = conn->txdata_octets; + stats->rxdata_octets = conn->rxdata_octets; + stats->scsicmd_pdus = conn->scsicmd_pdus_cnt; + stats->dataout_pdus = conn->dataout_pdus_cnt; + stats->scsirsp_pdus = conn->scsirsp_pdus_cnt; + stats->datain_pdus = conn->datain_pdus_cnt; + stats->r2t_pdus = conn->r2t_pdus_cnt; + stats->tmfcmd_pdus = conn->tmfcmd_pdus_cnt; + stats->tmfrsp_pdus = conn->tmfrsp_pdus_cnt; + stats->digest_err = 0; + stats->timeout_err = 0; + stats->custom_length = 1; + strcpy(stats->custom[0].desc, "eh_abort_cnt"); + stats->custom[0].value = conn->eh_abort_cnt; +} + +/** + * cxgb3i_parse_itt - get the idx and age bits from a given tag + * @conn: iscsi connection + * @itt: itt tag + * @idx: task index, filled in by this function + * @age: session age, filled in by this function + */ +static void cxgb3i_parse_itt(struct iscsi_conn *conn, itt_t itt, + int *idx, int *age) +{ + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + struct cxgb3i_adapter *snic = cconn->hba->snic; + u32 tag = ntohl((__force u32) itt); + u32 sw_bits; + + sw_bits = cxgb3i_tag_nonrsvd_bits(&snic->tag_format, tag); + if (idx) + *idx = sw_bits & ((1 << cconn->task_idx_bits) - 1); + if (age) + *age = (sw_bits >> cconn->task_idx_bits) & ISCSI_AGE_MASK; + + cxgb3i_tag_debug("parse tag 0x%x/0x%x, sw 0x%x, itt 0x%x, age 0x%x.\n", + tag, itt, sw_bits, idx ? *idx : 0xFFFFF, + age ? 
*age : 0xFF); +} + +/** + * cxgb3i_reserve_itt - generate tag for a given task + * Try to set up ddp for a scsi read task. + * @task: iscsi task + * @hdr_itt: tag, filled in by this function + */ +int cxgb3i_reserve_itt(struct iscsi_task *task, itt_t *hdr_itt) +{ + struct scsi_cmnd *sc = task->sc; + struct iscsi_conn *conn = task->conn; + struct iscsi_session *sess = conn->session; + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + struct cxgb3i_adapter *snic = cconn->hba->snic; + struct cxgb3i_tag_format *tformat = &snic->tag_format; + u32 sw_tag = (sess->age << cconn->task_idx_bits) | task->itt; + u32 tag; + int err = -EINVAL; + + if (sc && + (scsi_bidi_cmnd(sc) || sc->sc_data_direction == DMA_FROM_DEVICE) && + cxgb3i_sw_tag_usable(tformat, sw_tag)) { + struct s3_conn *c3cn = cconn->cep->c3cn; + struct cxgb3i_gather_list *gl; + + gl = cxgb3i_ddp_make_gl(scsi_in(sc)->length, + scsi_in(sc)->table.sgl, + scsi_in(sc)->table.nents, + snic->pdev, + GFP_ATOMIC); + if (gl) { + tag = sw_tag; + err = cxgb3i_ddp_tag_reserve(snic->tdev, c3cn->tid, + tformat, &tag, + gl, GFP_ATOMIC); + if (err < 0) + cxgb3i_ddp_release_gl(gl, snic->pdev); + } + } + + if (err < 0) + tag = cxgb3i_set_non_ddp_tag(tformat, sw_tag); + /* the itt needs to be sent in big-endian order */ + *hdr_itt = (__force itt_t)htonl(tag); + + cxgb3i_tag_debug("new tag 0x%x/0x%x (itt 0x%x, age 0x%x).\n", + tag, *hdr_itt, task->itt, sess->age); + return 0; +} + +/** + * cxgb3i_release_itt - release the tag for a given task + * if the tag is a ddp tag, release the ddp setup + * @task: iscsi task + * @hdr_itt: tag + */ +void cxgb3i_release_itt(struct iscsi_task *task, itt_t hdr_itt) +{ + struct scsi_cmnd *sc = task->sc; + struct iscsi_tcp_conn *tcp_conn = task->conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + struct cxgb3i_adapter *snic = cconn->hba->snic; + struct cxgb3i_tag_format *tformat = &snic->tag_format; + u32 tag = ntohl((__force u32)hdr_itt); + 
+ cxgb3i_tag_debug("release tag 0x%x.\n", tag); + + if (sc && + (scsi_bidi_cmnd(sc) || sc->sc_data_direction == DMA_FROM_DEVICE) && + cxgb3i_is_ddp_tag(tformat, tag)) + cxgb3i_ddp_tag_release(snic->tdev, tag); +} + +/** + * cxgb3i_host_template -- Scsi_Host_Template structure + * used when registering with the scsi mid layer + */ +static struct scsi_host_template cxgb3i_host_template = { + .module = THIS_MODULE, + .name = "Chelsio S3xx iSCSI Initiator", + .proc_name = "cxgb3i", + .queuecommand = iscsi_queuecommand, + .change_queue_depth = iscsi_change_queue_depth, + .can_queue = 128 * (ISCSI_DEF_XMIT_CMDS_MAX - 1), + .sg_tablesize = SG_ALL, + .max_sectors = 0xFFFF, + .cmd_per_lun = ISCSI_DEF_CMD_PER_LUN, + .eh_abort_handler = iscsi_eh_abort, + .eh_device_reset_handler = iscsi_eh_device_reset, + .eh_target_reset_handler = iscsi_eh_target_reset, + .use_clustering = DISABLE_CLUSTERING, + .this_id = -1, +}; + +static struct iscsi_transport cxgb3i_iscsi_transport = { + .owner = THIS_MODULE, + .name = "cxgb3i", + .caps = CAP_RECOVERY_L0 | CAP_MULTI_R2T | CAP_HDRDGST + | CAP_DATADGST | CAP_DIGEST_OFFLOAD | + CAP_PADDING_OFFLOAD, + .param_mask = ISCSI_MAX_RECV_DLENGTH | + ISCSI_MAX_XMIT_DLENGTH | + ISCSI_HDRDGST_EN | + ISCSI_DATADGST_EN | + ISCSI_INITIAL_R2T_EN | + ISCSI_MAX_R2T | + ISCSI_IMM_DATA_EN | + ISCSI_FIRST_BURST | + ISCSI_MAX_BURST | + ISCSI_PDU_INORDER_EN | + ISCSI_DATASEQ_INORDER_EN | + ISCSI_ERL | + ISCSI_CONN_PORT | + ISCSI_CONN_ADDRESS | + ISCSI_EXP_STATSN | + ISCSI_PERSISTENT_PORT | + ISCSI_PERSISTENT_ADDRESS | + ISCSI_TARGET_NAME | ISCSI_TPGT | + ISCSI_USERNAME | ISCSI_PASSWORD | + ISCSI_USERNAME_IN | ISCSI_PASSWORD_IN | + ISCSI_FAST_ABORT | ISCSI_ABORT_TMO | + ISCSI_LU_RESET_TMO | + ISCSI_PING_TMO | ISCSI_RECV_TMO | + ISCSI_IFACE_NAME | ISCSI_INITIATOR_NAME, + .host_param_mask = ISCSI_HOST_HWADDRESS | ISCSI_HOST_IPADDRESS | + ISCSI_HOST_INITIATOR_NAME | ISCSI_HOST_NETDEV_NAME, + .get_host_param = cxgb3i_host_get_param, + .set_host_param = 
cxgb3i_host_set_param, + /* session management */ + .create_session = cxgb3i_session_create, + .destroy_session = cxgb3i_session_destroy, + .get_session_param = iscsi_session_get_param, + /* connection management */ + .create_conn = cxgb3i_conn_create, + .bind_conn = cxgb3i_conn_bind, + .destroy_conn = iscsi_tcp_conn_teardown, + .start_conn = iscsi_conn_start, + .stop_conn = iscsi_conn_stop, + .get_conn_param = cxgb3i_conn_get_param, + .set_param = cxgb3i_conn_set_param, + .get_stats = cxgb3i_conn_get_stats, + /* pdu xmit req. from user space */ + .send_pdu = iscsi_conn_send_pdu, + /* task */ + .init_task = iscsi_tcp_task_init, + .xmit_task = iscsi_tcp_task_xmit, + .cleanup_task = cxgb3i_conn_cleanup_task, + + /* pdu */ + .alloc_pdu = cxgb3i_conn_alloc_pdu, + .init_pdu = cxgb3i_conn_init_pdu, + .xmit_pdu = cxgb3i_conn_xmit_pdu, + .parse_pdu_itt = cxgb3i_parse_itt, + + /* TCP connect/disconnect */ + .ep_connect = cxgb3i_ep_connect, + .ep_poll = cxgb3i_ep_poll, + .ep_disconnect = cxgb3i_ep_disconnect, + /* Error recovery timeout call */ + .session_recovery_timedout = iscsi_session_recovery_timedout, +}; + +int cxgb3i_iscsi_init(void) +{ + sw_tag_idx_bits = (__ilog2_u32(ISCSI_ITT_MASK)) + 1; + sw_tag_age_bits = (__ilog2_u32(ISCSI_AGE_MASK)) + 1; + cxgb3i_log_info("tag itt 0x%x, %u bits, age 0x%x, %u bits.\n", + ISCSI_ITT_MASK, sw_tag_idx_bits, + ISCSI_AGE_MASK, sw_tag_age_bits); + + cxgb3i_scsi_transport = + iscsi_register_transport(&cxgb3i_iscsi_transport); + if (!cxgb3i_scsi_transport) { + cxgb3i_log_error("Could not register cxgb3i transport.\n"); + return -ENODEV; + } + cxgb3i_api_debug("cxgb3i transport 0x%p.\n", cxgb3i_scsi_transport); + return 0; +} + +void cxgb3i_iscsi_cleanup(void) +{ + if (cxgb3i_scsi_transport) { + cxgb3i_api_debug("cxgb3i transport 0x%p.\n", + cxgb3i_scsi_transport); + iscsi_unregister_transport(&cxgb3i_iscsi_transport); + } +} diff --git a/drivers/scsi/cxgb3i/cxgb3i_offload.c b/drivers/scsi/cxgb3i/cxgb3i_offload.c new file mode 100644 
index 00000000000..5f16081b68d --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_offload.c @@ -0,0 +1,1810 @@ +/* + * cxgb3i_offload.c: Chelsio S3xx iscsi offloaded tcp connection management + * + * Copyright (C) 2003-2008 Chelsio Communications. All rights reserved. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the LICENSE file included in this + * release for licensing terms and conditions. + * + * Written by: Dimitris Michailidis (dm@chelsio.com) + * Karen Xie (kxie@chelsio.com) + */ + +#include +#include + +#include "cxgb3_defs.h" +#include "cxgb3_ctl_defs.h" +#include "firmware_exports.h" +#include "cxgb3i_offload.h" +#include "cxgb3i_pdu.h" +#include "cxgb3i_ddp.h" + +#ifdef __DEBUG_C3CN_CONN__ +#define c3cn_conn_debug cxgb3i_log_info +#else +#define c3cn_conn_debug(fmt...) +#endif + +#ifdef __DEBUG_C3CN_TX__ +#define c3cn_tx_debug cxgb3i_log_debug +#else +#define c3cn_tx_debug(fmt...) +#endif + +#ifdef __DEBUG_C3CN_RX__ +#define c3cn_rx_debug cxgb3i_log_debug +#else +#define c3cn_rx_debug(fmt...) +#endif + +/* + * module parameters related to offloaded iscsi connection + */ +static int cxgb3_rcv_win = 256 * 1024; +module_param(cxgb3_rcv_win, int, 0644); +MODULE_PARM_DESC(cxgb3_rcv_win, "TCP receive window in bytes (default=256KB)"); + +static int cxgb3_snd_win = 64 * 1024; +module_param(cxgb3_snd_win, int, 0644); +MODULE_PARM_DESC(cxgb3_snd_win, "TCP send window in bytes (default=64KB)"); + +static int cxgb3_rx_credit_thres = 10 * 1024; +module_param(cxgb3_rx_credit_thres, int, 0644); +MODULE_PARM_DESC(cxgb3_rx_credit_thres, + "RX credits return threshold in bytes (default=10KB)"); + +static unsigned int cxgb3_max_connect = 8 * 1024; +module_param(cxgb3_max_connect, uint, 0644); +MODULE_PARM_DESC(cxgb3_max_connect, "Max. 
# of connections (default=8192)"); + +static unsigned int cxgb3_sport_base = 20000; +module_param(cxgb3_sport_base, uint, 0644); +MODULE_PARM_DESC(cxgb3_sport_base, "starting port number (default=20000)"); + +/* + * cxgb3i tcp connection data(per adapter) list + */ +static LIST_HEAD(cdata_list); +static DEFINE_RWLOCK(cdata_rwlock); + +static int c3cn_push_tx_frames(struct s3_conn *c3cn, int req_completion); +static void c3cn_release_offload_resources(struct s3_conn *c3cn); + +/* + * iscsi source port management + * + * Find a free source port in the port allocation map. We use a very simple + * rotor scheme to look for the next free port. + * + * If a source port has been specified make sure that it doesn't collide with + * our normal source port allocation map. If it's outside the range of our + * allocation/deallocation scheme just let them use it. + * + * If the source port is outside our allocation range, the caller is + * responsible for keeping track of their port usage. + */ +static int c3cn_get_port(struct s3_conn *c3cn, struct cxgb3i_sdev_data *cdata) +{ + unsigned int start; + int idx; + + if (!cdata) + goto error_out; + + if (c3cn->saddr.sin_port != 0) { + idx = ntohs(c3cn->saddr.sin_port) - cxgb3_sport_base; + if (idx < 0 || idx >= cxgb3_max_connect) + return 0; + if (!test_and_set_bit(idx, cdata->sport_map)) + return -EADDRINUSE; + } + + /* the sport_map_next may not be accurate but that is okay, sport_map + should be */ + start = idx = cdata->sport_map_next; + do { + if (++idx >= cxgb3_max_connect) + idx = 0; + if (!(test_and_set_bit(idx, cdata->sport_map))) { + c3cn->saddr.sin_port = htons(cxgb3_sport_base + idx); + cdata->sport_map_next = idx; + c3cn_conn_debug("%s reserve port %u.\n", + cdata->cdev->name, + cxgb3_sport_base + idx); + return 0; + } + } while (idx != start); + +error_out: + return -EADDRNOTAVAIL; +} + +static void c3cn_put_port(struct s3_conn *c3cn) +{ + struct cxgb3i_sdev_data *cdata = CXGB3_SDEV_DATA(c3cn->cdev); + + if 
(c3cn->saddr.sin_port) { + int idx = ntohs(c3cn->saddr.sin_port) - cxgb3_sport_base; + + c3cn->saddr.sin_port = 0; + if (idx < 0 || idx >= cxgb3_max_connect) + return; + clear_bit(idx, cdata->sport_map); + c3cn_conn_debug("%s, release port %u.\n", + cdata->cdev->name, cxgb3_sport_base + idx); + } +} + +static inline void c3cn_set_flag(struct s3_conn *c3cn, enum c3cn_flags flag) +{ + __set_bit(flag, &c3cn->flags); + c3cn_conn_debug("c3cn 0x%p, set %d, s %u, f 0x%lx.\n", + c3cn, flag, c3cn->state, c3cn->flags); +} + +static inline void c3cn_clear_flag(struct s3_conn *c3cn, enum c3cn_flags flag) +{ + __clear_bit(flag, &c3cn->flags); + c3cn_conn_debug("c3cn 0x%p, clear %d, s %u, f 0x%lx.\n", + c3cn, flag, c3cn->state, c3cn->flags); +} + +static inline int c3cn_flag(struct s3_conn *c3cn, enum c3cn_flags flag) +{ + if (c3cn == NULL) + return 0; + return test_bit(flag, &c3cn->flags); +} + +static void c3cn_set_state(struct s3_conn *c3cn, int state) +{ + c3cn_conn_debug("c3cn 0x%p state -> %u.\n", c3cn, state); + c3cn->state = state; +} + +static inline void c3cn_hold(struct s3_conn *c3cn) +{ + atomic_inc(&c3cn->refcnt); +} + +static inline void c3cn_put(struct s3_conn *c3cn) +{ + if (atomic_dec_and_test(&c3cn->refcnt)) { + c3cn_conn_debug("free c3cn 0x%p, s %u, f 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + kfree(c3cn); + } +} + +static void c3cn_closed(struct s3_conn *c3cn) +{ + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + c3cn_put_port(c3cn); + c3cn_release_offload_resources(c3cn); + c3cn_set_state(c3cn, C3CN_STATE_CLOSED); + cxgb3i_conn_closing(c3cn); +} + +/* + * CPL (Chelsio Protocol Language) defines a message passing interface between + * the host driver and T3 asic. + * The section below implements the CPLs related to iscsi tcp connection + * open/close/abort and data send/receive. 
+ */ + +/* + * CPL connection active open request: host -> + */ +static unsigned int find_best_mtu(const struct t3c_data *d, unsigned short mtu) +{ + int i = 0; + + while (i < d->nmtus - 1 && d->mtus[i + 1] <= mtu) + ++i; + return i; +} + +static unsigned int select_mss(struct s3_conn *c3cn, unsigned int pmtu) +{ + unsigned int idx; + struct dst_entry *dst = c3cn->dst_cache; + struct t3cdev *cdev = c3cn->cdev; + const struct t3c_data *td = T3C_DATA(cdev); + u16 advmss = dst_metric(dst, RTAX_ADVMSS); + + if (advmss > pmtu - 40) + advmss = pmtu - 40; + if (advmss < td->mtus[0] - 40) + advmss = td->mtus[0] - 40; + idx = find_best_mtu(td, advmss + 40); + return idx; +} + +static inline int compute_wscale(int win) +{ + int wscale = 0; + while (wscale < 14 && (65535<<wscale) < win) + wscale++; + return wscale; +} + +static inline unsigned int calc_opt0h(struct s3_conn *c3cn) +{ + int wscale = compute_wscale(cxgb3_rcv_win); + return V_KEEP_ALIVE(1) | + F_TCAM_BYPASS | + V_WND_SCALE(wscale) | + V_MSS_IDX(c3cn->mss_idx); +} + +static inline unsigned int calc_opt0l(struct s3_conn *c3cn) +{ + return V_ULP_MODE(ULP_MODE_ISCSI) | + V_RCV_BUFSIZ(cxgb3_rcv_win>>10); +} + +static void make_act_open_req(struct s3_conn *c3cn, struct sk_buff *skb, + unsigned int atid, const struct l2t_entry *e) +{ + struct cpl_act_open_req *req; + + c3cn_conn_debug("c3cn 0x%p, atid 0x%x.\n", c3cn, atid); + + skb->priority = CPL_PRIORITY_SETUP; + req = (struct cpl_act_open_req *)__skb_put(skb, sizeof(*req)); + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD)); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_ACT_OPEN_REQ, atid)); + req->local_port = c3cn->saddr.sin_port; + req->peer_port = c3cn->daddr.sin_port; + req->local_ip = c3cn->saddr.sin_addr.s_addr; + req->peer_ip = c3cn->daddr.sin_addr.s_addr; + req->opt0h = htonl(calc_opt0h(c3cn) | V_L2T_IDX(e->idx) | + V_TX_CHANNEL(e->smt_idx)); + req->opt0l = htonl(calc_opt0l(c3cn)); + req->params = 0; +} + +static void fail_act_open(struct s3_conn *c3cn, int errno) +{ + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + c3cn->err = errno; + c3cn_closed(c3cn); +} + +static void act_open_req_arp_failure(struct t3cdev *dev, struct sk_buff *skb) +{ + struct s3_conn 
*c3cn = (struct s3_conn *)skb->sk; + + c3cn_conn_debug("c3cn 0x%p, state %u.\n", c3cn, c3cn->state); + + c3cn_hold(c3cn); + spin_lock_bh(&c3cn->lock); + if (c3cn->state == C3CN_STATE_CONNECTING) + fail_act_open(c3cn, EHOSTUNREACH); + spin_unlock_bh(&c3cn->lock); + c3cn_put(c3cn); + __kfree_skb(skb); +} + +/* + * CPL connection close request: host -> + * + * Close a connection by sending a CPL_CLOSE_CON_REQ message and queue it to + * the write queue (i.e., after any unsent tx data). + */ +static void skb_entail(struct s3_conn *c3cn, struct sk_buff *skb, + int flags) +{ + CXGB3_SKB_CB(skb)->seq = c3cn->write_seq; + CXGB3_SKB_CB(skb)->flags = flags; + __skb_queue_tail(&c3cn->write_queue, skb); +} + +static void send_close_req(struct s3_conn *c3cn) +{ + struct sk_buff *skb = c3cn->cpl_close; + struct cpl_close_con_req *req = (struct cpl_close_con_req *)skb->head; + unsigned int tid = c3cn->tid; + + c3cn_conn_debug("c3cn 0x%p, state 0x%x, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + c3cn->cpl_close = NULL; + + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_OFLD_CLOSE_CON)); + req->wr.wr_lo = htonl(V_WR_TID(tid)); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_CLOSE_CON_REQ, tid)); + req->rsvd = htonl(c3cn->write_seq); + + skb_entail(c3cn, skb, C3CB_FLAG_NO_APPEND); + if (c3cn->state != C3CN_STATE_CONNECTING) + c3cn_push_tx_frames(c3cn, 1); +} + +/* + * CPL connection abort request: host -> + * + * Send an ABORT_REQ message. Makes sure we do not send multiple ABORT_REQs + * for the same connection and also that we do not try to send a message + * after the connection has closed. 
+ */ +static void abort_arp_failure(struct t3cdev *cdev, struct sk_buff *skb) +{ + struct cpl_abort_req *req = cplhdr(skb); + + c3cn_conn_debug("tdev 0x%p.\n", cdev); + + req->cmd = CPL_ABORT_NO_RST; + cxgb3_ofld_send(cdev, skb); +} + +static inline void c3cn_purge_write_queue(struct s3_conn *c3cn) +{ + struct sk_buff *skb; + + while ((skb = __skb_dequeue(&c3cn->write_queue))) + __kfree_skb(skb); +} + +static void send_abort_req(struct s3_conn *c3cn) +{ + struct sk_buff *skb = c3cn->cpl_abort_req; + struct cpl_abort_req *req; + unsigned int tid = c3cn->tid; + + if (unlikely(c3cn->state == C3CN_STATE_ABORTING) || !skb || + !c3cn->cdev) + return; + + c3cn_set_state(c3cn, C3CN_STATE_ABORTING); + + c3cn_conn_debug("c3cn 0x%p, flag ABORT_RPL + ABORT_SHUT.\n", c3cn); + + c3cn_set_flag(c3cn, C3CN_ABORT_RPL_PENDING); + + /* Purge the send queue so we don't send anything after an abort. */ + c3cn_purge_write_queue(c3cn); + + c3cn->cpl_abort_req = NULL; + req = (struct cpl_abort_req *)skb->head; + + skb->priority = CPL_PRIORITY_DATA; + set_arp_failure_handler(skb, abort_arp_failure); + + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_OFLD_HOST_ABORT_CON_REQ)); + req->wr.wr_lo = htonl(V_WR_TID(tid)); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_ABORT_REQ, tid)); + req->rsvd0 = htonl(c3cn->snd_nxt); + req->rsvd1 = !c3cn_flag(c3cn, C3CN_TX_DATA_SENT); + req->cmd = CPL_ABORT_SEND_RST; + + l2t_send(c3cn->cdev, skb, c3cn->l2t); +} + +/* + * CPL connection abort reply: host -> + * + * Send an ABORT_RPL message in response to the ABORT_REQ received. 
+ */ +static void send_abort_rpl(struct s3_conn *c3cn, int rst_status) +{ + struct sk_buff *skb = c3cn->cpl_abort_rpl; + struct cpl_abort_rpl *rpl = (struct cpl_abort_rpl *)skb->head; + + c3cn->cpl_abort_rpl = NULL; + + skb->priority = CPL_PRIORITY_DATA; + rpl->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_OFLD_HOST_ABORT_CON_RPL)); + rpl->wr.wr_lo = htonl(V_WR_TID(c3cn->tid)); + OPCODE_TID(rpl) = htonl(MK_OPCODE_TID(CPL_ABORT_RPL, c3cn->tid)); + rpl->cmd = rst_status; + + cxgb3_ofld_send(c3cn->cdev, skb); +} + +/* + * CPL connection rx data ack: host -> + * Send RX credits through an RX_DATA_ACK CPL message. Returns the number of + * credits sent. + */ +static u32 send_rx_credits(struct s3_conn *c3cn, u32 credits, u32 dack) +{ + struct sk_buff *skb; + struct cpl_rx_data_ack *req; + + skb = alloc_skb(sizeof(*req), GFP_ATOMIC); + if (!skb) + return 0; + + req = (struct cpl_rx_data_ack *)__skb_put(skb, sizeof(*req)); + req->wr.wr_hi = htonl(V_WR_OP(FW_WROPCODE_FORWARD)); + OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_RX_DATA_ACK, c3cn->tid)); + req->credit_dack = htonl(dack | V_RX_CREDITS(credits)); + skb->priority = CPL_PRIORITY_ACK; + cxgb3_ofld_send(c3cn->cdev, skb); + return credits; +} + +/* + * CPL connection tx data: host -> + * + * Send iscsi PDU via TX_DATA CPL message. Returns the number of + * credits sent. + * Each TX_DATA consumes work request credit (wrs), so we need to keep track of + * how many we've used so far and how many are pending (i.e., not yet ack'ed + * by T3). + */ + +/* + * For ULP connections HW may insert digest bytes into the pdu. Those digest + * bytes are not sent by the host but are part of the TCP payload and therefore + * consume TCP sequence space. 
+ */ +static const unsigned int cxgb3_ulp_extra_len[] = { 0, 4, 4, 8 }; +static inline unsigned int ulp_extra_len(const struct sk_buff *skb) +{ + return cxgb3_ulp_extra_len[skb_ulp_mode(skb) & 3]; +} + +static unsigned int wrlen __read_mostly; + +/* + * The number of WRs needed for an skb depends on the number of fragments + * in the skb and whether it has any payload in its main body. This maps the + * length of the gather list represented by an skb into the # of necessary WRs. + * + * The max. length of an skb is controlled by the max pdu size which is ~16K. + * Also, assume the min. fragment length is the sector size (512), then add + * extra fragment counts for iscsi bhs and payload padding. + */ +#define SKB_WR_LIST_SIZE (16384/512 + 3) +static unsigned int skb_wrs[SKB_WR_LIST_SIZE] __read_mostly; + +static void s3_init_wr_tab(unsigned int wr_len) +{ + int i; + + if (skb_wrs[1]) /* already initialized */ + return; + + for (i = 1; i < SKB_WR_LIST_SIZE; i++) { + int sgl_len = (3 * i) / 2 + (i & 1); + + sgl_len += 3; + skb_wrs[i] = (sgl_len <= wr_len + ? 1 : 1 + (sgl_len - 2) / (wr_len - 1)); + } + + wrlen = wr_len * 8; +} + +static inline void reset_wr_list(struct s3_conn *c3cn) +{ + c3cn->wr_pending_head = NULL; +} + +/* + * Add a WR to a connection's list of pending WRs. This is a singly-linked + * list of sk_buffs operating as a FIFO. The head is kept in wr_pending_head + * and the tail in wr_pending_tail. + */ +static inline void enqueue_wr(struct s3_conn *c3cn, + struct sk_buff *skb) +{ + skb->sp = NULL; + + /* + * We want to take an extra reference since both us and the driver + * need to free the packet before it's really freed. We know there's + * just one user currently so we use atomic_set rather than skb_get + * to avoid the atomic op. 
+ */ + atomic_set(&skb->users, 2); + + if (!c3cn->wr_pending_head) + c3cn->wr_pending_head = skb; + else + c3cn->wr_pending_tail->sp = (void *)skb; + c3cn->wr_pending_tail = skb; +} + +static inline struct sk_buff *peek_wr(const struct s3_conn *c3cn) +{ + return c3cn->wr_pending_head; +} + +static inline void free_wr_skb(struct sk_buff *skb) +{ + kfree_skb(skb); +} + +static inline struct sk_buff *dequeue_wr(struct s3_conn *c3cn) +{ + struct sk_buff *skb = c3cn->wr_pending_head; + + if (likely(skb)) { + /* Don't bother clearing the tail */ + c3cn->wr_pending_head = (struct sk_buff *)skb->sp; + skb->sp = NULL; + } + return skb; +} + +static void purge_wr_queue(struct s3_conn *c3cn) +{ + struct sk_buff *skb; + while ((skb = dequeue_wr(c3cn)) != NULL) + free_wr_skb(skb); +} + +static inline void make_tx_data_wr(struct s3_conn *c3cn, struct sk_buff *skb, + int len) +{ + struct tx_data_wr *req; + + skb_reset_transport_header(skb); + req = (struct tx_data_wr *)__skb_push(skb, sizeof(*req)); + req->wr_hi = htonl(V_WR_OP(FW_WROPCODE_OFLD_TX_DATA)); + req->wr_lo = htonl(V_WR_TID(c3cn->tid)); + req->sndseq = htonl(c3cn->snd_nxt); + /* len includes the length of any HW ULP additions */ + req->len = htonl(len); + req->param = htonl(V_TX_PORT(c3cn->l2t->smt_idx)); + /* V_TX_ULP_SUBMODE sets both the mode and submode */ + req->flags = htonl(V_TX_ULP_SUBMODE(skb_ulp_mode(skb)) | + V_TX_SHOVE((skb_peek(&c3cn->write_queue) ? 0 : 1))); + + if (!c3cn_flag(c3cn, C3CN_TX_DATA_SENT)) { + req->flags |= htonl(V_TX_ACK_PAGES(2) | F_TX_INIT | + V_TX_CPU_IDX(c3cn->qset)); + /* Sendbuffer is in units of 32KB. */ + req->param |= htonl(V_TX_SNDBUF(cxgb3_snd_win >> 15)); + c3cn_set_flag(c3cn, C3CN_TX_DATA_SENT); + } +} + +/** + * c3cn_push_tx_frames -- start transmit + * @c3cn: the offloaded connection + * @req_completion: request wr_ack or not + * + * Prepends TX_DATA_WR or CPL_CLOSE_CON_REQ headers to buffers waiting in a + * connection's send queue and sends them on to T3. 
Must be called with the + * connection's lock held. Returns the amount of send buffer space that was + * freed as a result of sending queued data to T3. + */ +static void arp_failure_discard(struct t3cdev *cdev, struct sk_buff *skb) +{ + kfree_skb(skb); +} + +static int c3cn_push_tx_frames(struct s3_conn *c3cn, int req_completion) +{ + int total_size = 0; + struct sk_buff *skb; + struct t3cdev *cdev; + struct cxgb3i_sdev_data *cdata; + + if (unlikely(c3cn->state == C3CN_STATE_CONNECTING || + c3cn->state == C3CN_STATE_CLOSE_WAIT_1 || + c3cn->state == C3CN_STATE_ABORTING)) { + c3cn_tx_debug("c3cn 0x%p, in closing state %u.\n", + c3cn, c3cn->state); + return 0; + } + + cdev = c3cn->cdev; + cdata = CXGB3_SDEV_DATA(cdev); + + while (c3cn->wr_avail + && (skb = skb_peek(&c3cn->write_queue)) != NULL) { + int len = skb->len; /* length before skb_push */ + int frags = skb_shinfo(skb)->nr_frags + (len != skb->data_len); + int wrs_needed = skb_wrs[frags]; + + if (wrs_needed > 1 && len + sizeof(struct tx_data_wr) <= wrlen) + wrs_needed = 1; + + WARN_ON(frags >= SKB_WR_LIST_SIZE || wrs_needed < 1); + + if (c3cn->wr_avail < wrs_needed) { + c3cn_tx_debug("c3cn 0x%p, skb len %u/%u, frag %u, " + "wr %d < %u.\n", + c3cn, skb->len, skb->data_len, frags, + wrs_needed, c3cn->wr_avail); + break; + } + + __skb_unlink(skb, &c3cn->write_queue); + skb->priority = CPL_PRIORITY_DATA; + skb->csum = wrs_needed; /* remember this until the WR_ACK */ + c3cn->wr_avail -= wrs_needed; + c3cn->wr_unacked += wrs_needed; + enqueue_wr(c3cn, skb); + + if (likely(CXGB3_SKB_CB(skb)->flags & C3CB_FLAG_NEED_HDR)) { + len += ulp_extra_len(skb); + make_tx_data_wr(c3cn, skb, len); + c3cn->snd_nxt += len; + if ((req_completion + && c3cn->wr_unacked == wrs_needed) + || (CXGB3_SKB_CB(skb)->flags & C3CB_FLAG_COMPL) + || c3cn->wr_unacked >= c3cn->wr_max / 2) { + struct work_request_hdr *wr = cplhdr(skb); + + wr->wr_hi |= htonl(F_WR_COMPL); + c3cn->wr_unacked = 0; + } + CXGB3_SKB_CB(skb)->flags &= ~C3CB_FLAG_NEED_HDR; + 
} + + total_size += skb->truesize; + set_arp_failure_handler(skb, arp_failure_discard); + l2t_send(cdev, skb, c3cn->l2t); + } + return total_size; +} + +/* + * process_cpl_msg: -> host + * Top-level CPL message processing used by most CPL messages that + * pertain to connections. + */ +static inline void process_cpl_msg(void (*fn)(struct s3_conn *, + struct sk_buff *), + struct s3_conn *c3cn, + struct sk_buff *skb) +{ + spin_lock_bh(&c3cn->lock); + fn(c3cn, skb); + spin_unlock_bh(&c3cn->lock); +} + +/* + * process_cpl_msg_ref: -> host + * Similar to process_cpl_msg() but takes an extra connection reference around + * the call to the handler. Should be used if the handler may drop a + * connection reference. + */ +static inline void process_cpl_msg_ref(void (*fn) (struct s3_conn *, + struct sk_buff *), + struct s3_conn *c3cn, + struct sk_buff *skb) +{ + c3cn_hold(c3cn); + process_cpl_msg(fn, c3cn, skb); + c3cn_put(c3cn); +} + +/* + * Process a CPL_ACT_ESTABLISH message: -> host + * Updates connection state from an active establish CPL message. Runs with + * the connection lock held. + */ + +static inline void s3_free_atid(struct t3cdev *cdev, unsigned int tid) +{ + struct s3_conn *c3cn = cxgb3_free_atid(cdev, tid); + if (c3cn) + c3cn_put(c3cn); +} + +static void c3cn_established(struct s3_conn *c3cn, u32 snd_isn, + unsigned int opt) +{ + c3cn_conn_debug("c3cn 0x%p, state %u.\n", c3cn, c3cn->state); + + c3cn->write_seq = c3cn->snd_nxt = c3cn->snd_una = snd_isn; + + /* + * Causes the first RX_DATA_ACK to supply any Rx credits we couldn't + * pass through opt0. 
+ */ + if (cxgb3_rcv_win > (M_RCV_BUFSIZ << 10)) + c3cn->rcv_wup -= cxgb3_rcv_win - (M_RCV_BUFSIZ << 10); + + dst_confirm(c3cn->dst_cache); + + smp_mb(); + + c3cn_set_state(c3cn, C3CN_STATE_ESTABLISHED); +} + +static void process_act_establish(struct s3_conn *c3cn, struct sk_buff *skb) +{ + struct cpl_act_establish *req = cplhdr(skb); + u32 rcv_isn = ntohl(req->rcv_isn); /* real RCV_ISN + 1 */ + + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + if (unlikely(c3cn->state != C3CN_STATE_CONNECTING)) + cxgb3i_log_error("TID %u expected SYN_SENT, got EST., s %u\n", + c3cn->tid, c3cn->state); + + c3cn->copied_seq = c3cn->rcv_wup = c3cn->rcv_nxt = rcv_isn; + c3cn_established(c3cn, ntohl(req->snd_isn), ntohs(req->tcp_opt)); + + __kfree_skb(skb); + + if (unlikely(c3cn_flag(c3cn, C3CN_ACTIVE_CLOSE_NEEDED))) + /* upper layer has requested closing */ + send_abort_req(c3cn); + else if (c3cn_push_tx_frames(c3cn, 1)) + cxgb3i_conn_tx_open(c3cn); +} + +static int do_act_establish(struct t3cdev *cdev, struct sk_buff *skb, + void *ctx) +{ + struct cpl_act_establish *req = cplhdr(skb); + unsigned int tid = GET_TID(req); + unsigned int atid = G_PASS_OPEN_TID(ntohl(req->tos_tid)); + struct s3_conn *c3cn = ctx; + struct cxgb3i_sdev_data *cdata = CXGB3_SDEV_DATA(cdev); + + c3cn_conn_debug("rcv, tid 0x%x, c3cn 0x%p, s %u, f 0x%lx.\n", + tid, c3cn, c3cn->state, c3cn->flags); + + c3cn->tid = tid; + c3cn_hold(c3cn); + cxgb3_insert_tid(cdata->cdev, cdata->client, c3cn, tid); + s3_free_atid(cdev, atid); + + c3cn->qset = G_QNUM(ntohl(skb->csum)); + + process_cpl_msg(process_act_establish, c3cn, skb); + return 0; +} + +/* + * Process a CPL_ACT_OPEN_RPL message: -> host + * Handle active open failures. 
+ */ +static int act_open_rpl_status_to_errno(int status) +{ + switch (status) { + case CPL_ERR_CONN_RESET: + return ECONNREFUSED; + case CPL_ERR_ARP_MISS: + return EHOSTUNREACH; + case CPL_ERR_CONN_TIMEDOUT: + return ETIMEDOUT; + case CPL_ERR_TCAM_FULL: + return ENOMEM; + case CPL_ERR_CONN_EXIST: + cxgb3i_log_error("ACTIVE_OPEN_RPL: 4-tuple in use\n"); + return EADDRINUSE; + default: + return EIO; + } +} + +static void act_open_retry_timer(unsigned long data) +{ + struct sk_buff *skb; + struct s3_conn *c3cn = (struct s3_conn *)data; + + c3cn_conn_debug("c3cn 0x%p, state %u.\n", c3cn, c3cn->state); + + spin_lock_bh(&c3cn->lock); + skb = alloc_skb(sizeof(struct cpl_act_open_req), GFP_ATOMIC); + if (!skb) + fail_act_open(c3cn, ENOMEM); + else { + skb->sk = (struct sock *)c3cn; + set_arp_failure_handler(skb, act_open_req_arp_failure); + make_act_open_req(c3cn, skb, c3cn->tid, c3cn->l2t); + l2t_send(c3cn->cdev, skb, c3cn->l2t); + } + spin_unlock_bh(&c3cn->lock); + c3cn_put(c3cn); +} + +static void process_act_open_rpl(struct s3_conn *c3cn, struct sk_buff *skb) +{ + struct cpl_act_open_rpl *rpl = cplhdr(skb); + + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + if (rpl->status == CPL_ERR_CONN_EXIST && + c3cn->retry_timer.function != act_open_retry_timer) { + c3cn->retry_timer.function = act_open_retry_timer; + if (!mod_timer(&c3cn->retry_timer, jiffies + HZ / 2)) + c3cn_hold(c3cn); + } else + fail_act_open(c3cn, act_open_rpl_status_to_errno(rpl->status)); + __kfree_skb(skb); +} + +static int do_act_open_rpl(struct t3cdev *cdev, struct sk_buff *skb, void *ctx) +{ + struct s3_conn *c3cn = ctx; + struct cpl_act_open_rpl *rpl = cplhdr(skb); + + c3cn_conn_debug("rcv, status 0x%x, c3cn 0x%p, s %u, f 0x%lx.\n", + rpl->status, c3cn, c3cn->state, c3cn->flags); + + if (rpl->status != CPL_ERR_TCAM_FULL && + rpl->status != CPL_ERR_CONN_EXIST && + rpl->status != CPL_ERR_ARP_MISS) + cxgb3_queue_tid_release(cdev, GET_TID(rpl)); + + 
process_cpl_msg_ref(process_act_open_rpl, c3cn, skb); + return 0; +} + +/* + * Process PEER_CLOSE CPL messages: -> host + * Handle peer FIN. + */ +static void process_peer_close(struct s3_conn *c3cn, struct sk_buff *skb) +{ + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + if (c3cn_flag(c3cn, C3CN_ABORT_RPL_PENDING)) + goto out; + + switch (c3cn->state) { + case C3CN_STATE_ESTABLISHED: + c3cn_set_state(c3cn, C3CN_STATE_PASSIVE_CLOSE); + break; + case C3CN_STATE_ACTIVE_CLOSE: + c3cn_set_state(c3cn, C3CN_STATE_CLOSE_WAIT_2); + break; + case C3CN_STATE_CLOSE_WAIT_1: + c3cn_closed(c3cn); + break; + case C3CN_STATE_ABORTING: + break; + default: + cxgb3i_log_error("%s: peer close, TID %u in bad state %u\n", + c3cn->cdev->name, c3cn->tid, c3cn->state); + } + + cxgb3i_conn_closing(c3cn); +out: + __kfree_skb(skb); +} + +static int do_peer_close(struct t3cdev *cdev, struct sk_buff *skb, void *ctx) +{ + struct s3_conn *c3cn = ctx; + + c3cn_conn_debug("rcv, c3cn 0x%p, s %u, f 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + process_cpl_msg_ref(process_peer_close, c3cn, skb); + return 0; +} + +/* + * Process CLOSE_CONN_RPL CPL message: -> host + * Process a peer ACK to our FIN. 
+ */ +static void process_close_con_rpl(struct s3_conn *c3cn, struct sk_buff *skb) +{ + struct cpl_close_con_rpl *rpl = cplhdr(skb); + + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + c3cn->snd_una = ntohl(rpl->snd_nxt) - 1; /* exclude FIN */ + + if (c3cn_flag(c3cn, C3CN_ABORT_RPL_PENDING)) + goto out; + + switch (c3cn->state) { + case C3CN_STATE_ACTIVE_CLOSE: + c3cn_set_state(c3cn, C3CN_STATE_CLOSE_WAIT_1); + break; + case C3CN_STATE_CLOSE_WAIT_1: + case C3CN_STATE_CLOSE_WAIT_2: + c3cn_closed(c3cn); + break; + case C3CN_STATE_ABORTING: + break; + default: + cxgb3i_log_error("%s: close_rpl, TID %u in bad state %u\n", + c3cn->cdev->name, c3cn->tid, c3cn->state); + } + +out: + kfree_skb(skb); +} + +static int do_close_con_rpl(struct t3cdev *cdev, struct sk_buff *skb, + void *ctx) +{ + struct s3_conn *c3cn = ctx; + + c3cn_conn_debug("rcv, c3cn 0x%p, s %u, f 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + process_cpl_msg_ref(process_close_con_rpl, c3cn, skb); + return 0; +} + +/* + * Process ABORT_REQ_RSS CPL message: -> host + * Process abort requests. If we are waiting for an ABORT_RPL we ignore this + * request except that we need to reply to it. + */ + +static int abort_status_to_errno(struct s3_conn *c3cn, int abort_reason, + int *need_rst) +{ + switch (abort_reason) { + case CPL_ERR_BAD_SYN: /* fall through */ + case CPL_ERR_CONN_RESET: + return c3cn->state > C3CN_STATE_ESTABLISHED ? 
+ EPIPE : ECONNRESET; + case CPL_ERR_XMIT_TIMEDOUT: + case CPL_ERR_PERSIST_TIMEDOUT: + case CPL_ERR_FINWAIT2_TIMEDOUT: + case CPL_ERR_KEEPALIVE_TIMEDOUT: + return ETIMEDOUT; + default: + return EIO; + } +} + +static void process_abort_req(struct s3_conn *c3cn, struct sk_buff *skb) +{ + int rst_status = CPL_ABORT_NO_RST; + const struct cpl_abort_req_rss *req = cplhdr(skb); + + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + if (!c3cn_flag(c3cn, C3CN_ABORT_REQ_RCVD)) { + c3cn_set_flag(c3cn, C3CN_ABORT_REQ_RCVD); + c3cn_set_state(c3cn, C3CN_STATE_ABORTING); + __kfree_skb(skb); + return; + } + + c3cn_clear_flag(c3cn, C3CN_ABORT_REQ_RCVD); + send_abort_rpl(c3cn, rst_status); + + if (!c3cn_flag(c3cn, C3CN_ABORT_RPL_PENDING)) { + c3cn->err = + abort_status_to_errno(c3cn, req->status, &rst_status); + c3cn_closed(c3cn); + } +} + +static int do_abort_req(struct t3cdev *cdev, struct sk_buff *skb, void *ctx) +{ + const struct cpl_abort_req_rss *req = cplhdr(skb); + struct s3_conn *c3cn = ctx; + + c3cn_conn_debug("rcv, c3cn 0x%p, s 0x%x, f 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + if (req->status == CPL_ERR_RTX_NEG_ADVICE || + req->status == CPL_ERR_PERSIST_NEG_ADVICE) { + __kfree_skb(skb); + return 0; + } + + process_cpl_msg_ref(process_abort_req, c3cn, skb); + return 0; +} + +/* + * Process ABORT_RPL_RSS CPL message: -> host + * Process abort replies. We only process these messages if we anticipate + * them as the coordination between SW and HW in this area is somewhat lacking + * and sometimes we get ABORT_RPLs after we are done with the connection that + * originated the ABORT_REQ. 
+ */ +static void process_abort_rpl(struct s3_conn *c3cn, struct sk_buff *skb) +{ + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + if (c3cn_flag(c3cn, C3CN_ABORT_RPL_PENDING)) { + if (!c3cn_flag(c3cn, C3CN_ABORT_RPL_RCVD)) + c3cn_set_flag(c3cn, C3CN_ABORT_RPL_RCVD); + else { + c3cn_clear_flag(c3cn, C3CN_ABORT_RPL_RCVD); + c3cn_clear_flag(c3cn, C3CN_ABORT_RPL_PENDING); + if (c3cn_flag(c3cn, C3CN_ABORT_REQ_RCVD)) + cxgb3i_log_error("%s tid %u, ABORT_RPL_RSS\n", + c3cn->cdev->name, c3cn->tid); + c3cn_closed(c3cn); + } + } + __kfree_skb(skb); +} + +static int do_abort_rpl(struct t3cdev *cdev, struct sk_buff *skb, void *ctx) +{ + struct cpl_abort_rpl_rss *rpl = cplhdr(skb); + struct s3_conn *c3cn = ctx; + + c3cn_conn_debug("rcv, status 0x%x, c3cn 0x%p, s %u, 0x%lx.\n", + rpl->status, c3cn, c3cn ? c3cn->state : 0, + c3cn ? c3cn->flags : 0UL); + + /* + * Ignore replies to post-close aborts indicating that the abort was + * requested too late. These connections are terminated when we get + * PEER_CLOSE or CLOSE_CON_RPL and by the time the abort_rpl_rss + * arrives the TID is either no longer used or it has been recycled. + */ + if (rpl->status == CPL_ERR_ABORT_FAILED) + goto discard; + + /* + * Sometimes we've already closed the connection, e.g., a post-close + * abort races with ABORT_REQ_RSS, the latter frees the connection + * expecting the ABORT_REQ will fail with CPL_ERR_ABORT_FAILED, + * but FW turns the ABORT_REQ into a regular one and so we get + * ABORT_RPL_RSS with status 0 and no connection. + */ + if (!c3cn) + goto discard; + + process_cpl_msg_ref(process_abort_rpl, c3cn, skb); + return 0; + +discard: + __kfree_skb(skb); + return 0; +} + +/* + * Process RX_ISCSI_HDR CPL message: -> host + * Handle received PDUs, the payload could be DDP'ed. If not, the payload + * follows the bhs. 
+ */ +static void process_rx_iscsi_hdr(struct s3_conn *c3cn, struct sk_buff *skb) +{ + struct cpl_iscsi_hdr *hdr_cpl = cplhdr(skb); + struct cpl_iscsi_hdr_norss data_cpl; + struct cpl_rx_data_ddp_norss ddp_cpl; + unsigned int hdr_len, data_len, status; + unsigned int len; + int err; + + if (unlikely(c3cn->state >= C3CN_STATE_PASSIVE_CLOSE)) { + if (c3cn->state != C3CN_STATE_ABORTING) + send_abort_req(c3cn); + __kfree_skb(skb); + return; + } + + CXGB3_SKB_CB(skb)->seq = ntohl(hdr_cpl->seq); + CXGB3_SKB_CB(skb)->flags = 0; + + skb_reset_transport_header(skb); + __skb_pull(skb, sizeof(struct cpl_iscsi_hdr)); + + len = hdr_len = ntohs(hdr_cpl->len); + /* msg coalesce is off or not enough data received */ + if (skb->len <= hdr_len) { + cxgb3i_log_error("%s: TID %u, ISCSI_HDR, skb len %u < %u.\n", + c3cn->cdev->name, c3cn->tid, + skb->len, hdr_len); + goto abort_conn; + } + + err = skb_copy_bits(skb, skb->len - sizeof(ddp_cpl), &ddp_cpl, + sizeof(ddp_cpl)); + if (err < 0) + goto abort_conn; + + skb_ulp_mode(skb) = ULP2_FLAG_DATA_READY; + skb_ulp_pdulen(skb) = ntohs(ddp_cpl.len); + skb_ulp_ddigest(skb) = ntohl(ddp_cpl.ulp_crc); + status = ntohl(ddp_cpl.ddp_status); + + c3cn_rx_debug("rx skb 0x%p, len %u, pdulen %u, ddp status 0x%x.\n", + skb, skb->len, skb_ulp_pdulen(skb), status); + + if (status & (1 << RX_DDP_STATUS_HCRC_SHIFT)) + skb_ulp_mode(skb) |= ULP2_FLAG_HCRC_ERROR; + if (status & (1 << RX_DDP_STATUS_DCRC_SHIFT)) + skb_ulp_mode(skb) |= ULP2_FLAG_DCRC_ERROR; + if (status & (1 << RX_DDP_STATUS_PAD_SHIFT)) + skb_ulp_mode(skb) |= ULP2_FLAG_PAD_ERROR; + + if (skb->len > (hdr_len + sizeof(ddp_cpl))) { + err = skb_copy_bits(skb, hdr_len, &data_cpl, sizeof(data_cpl)); + if (err < 0) + goto abort_conn; + data_len = ntohs(data_cpl.len); + len += sizeof(data_cpl) + data_len; + } else if (status & (1 << RX_DDP_STATUS_DDP_SHIFT)) + skb_ulp_mode(skb) |= ULP2_FLAG_DATA_DDPED; + + c3cn->rcv_nxt = ntohl(ddp_cpl.seq) + skb_ulp_pdulen(skb); + __pskb_trim(skb, len); + 
__skb_queue_tail(&c3cn->receive_queue, skb); + cxgb3i_conn_pdu_ready(c3cn); + + return; + +abort_conn: + send_abort_req(c3cn); + __kfree_skb(skb); +} + +static int do_iscsi_hdr(struct t3cdev *t3dev, struct sk_buff *skb, void *ctx) +{ + struct s3_conn *c3cn = ctx; + + process_cpl_msg(process_rx_iscsi_hdr, c3cn, skb); + return 0; +} + +/* + * Process TX_DATA_ACK CPL messages: -> host + * Process an acknowledgment of WR completion. Advance snd_una and send the + * next batch of work requests from the write queue. + */ +static void process_wr_ack(struct s3_conn *c3cn, struct sk_buff *skb) +{ + struct cpl_wr_ack *hdr = cplhdr(skb); + unsigned int credits = ntohs(hdr->credits); + u32 snd_una = ntohl(hdr->snd_una); + + c3cn->wr_avail += credits; + if (c3cn->wr_unacked > c3cn->wr_max - c3cn->wr_avail) + c3cn->wr_unacked = c3cn->wr_max - c3cn->wr_avail; + + while (credits) { + struct sk_buff *p = peek_wr(c3cn); + + if (unlikely(!p)) { + cxgb3i_log_error("%u WR_ACK credits for TID %u with " + "nothing pending, state %u\n", + credits, c3cn->tid, c3cn->state); + break; + } + if (unlikely(credits < p->csum)) { + p->csum -= credits; + break; + } else { + dequeue_wr(c3cn); + credits -= p->csum; + free_wr_skb(p); + } + } + + if (unlikely(before(snd_una, c3cn->snd_una))) + goto out_free; + + if (c3cn->snd_una != snd_una) { + c3cn->snd_una = snd_una; + dst_confirm(c3cn->dst_cache); + } + + if (skb_queue_len(&c3cn->write_queue) && c3cn_push_tx_frames(c3cn, 0)) + cxgb3i_conn_tx_open(c3cn); +out_free: + __kfree_skb(skb); +} + +static int do_wr_ack(struct t3cdev *cdev, struct sk_buff *skb, void *ctx) +{ + struct s3_conn *c3cn = ctx; + + process_cpl_msg(process_wr_ack, c3cn, skb); + return 0; +} + +/* + * for each connection, pre-allocate skbs needed for close/abort requests. So + * that we can service the request right away. 
+ */ +static void c3cn_free_cpl_skbs(struct s3_conn *c3cn) +{ + if (c3cn->cpl_close) + kfree_skb(c3cn->cpl_close); + if (c3cn->cpl_abort_req) + kfree_skb(c3cn->cpl_abort_req); + if (c3cn->cpl_abort_rpl) + kfree_skb(c3cn->cpl_abort_rpl); +} + +static int c3cn_alloc_cpl_skbs(struct s3_conn *c3cn) +{ + c3cn->cpl_close = alloc_skb(sizeof(struct cpl_close_con_req), + GFP_KERNEL); + if (!c3cn->cpl_close) + return -ENOMEM; + skb_put(c3cn->cpl_close, sizeof(struct cpl_close_con_req)); + + c3cn->cpl_abort_req = alloc_skb(sizeof(struct cpl_abort_req), + GFP_KERNEL); + if (!c3cn->cpl_abort_req) + goto free_cpl_skbs; + skb_put(c3cn->cpl_abort_req, sizeof(struct cpl_abort_req)); + + c3cn->cpl_abort_rpl = alloc_skb(sizeof(struct cpl_abort_rpl), + GFP_KERNEL); + if (!c3cn->cpl_abort_rpl) + goto free_cpl_skbs; + skb_put(c3cn->cpl_abort_rpl, sizeof(struct cpl_abort_rpl)); + + return 0; + +free_cpl_skbs: + c3cn_free_cpl_skbs(c3cn); + return -ENOMEM; +} + +/** + * c3cn_release_offload_resources - release offload resource + * @c3cn: the offloaded iscsi tcp connection. + * Release resources held by an offload connection (TID, L2T entry, etc.) + */ +static void c3cn_release_offload_resources(struct s3_conn *c3cn) +{ + struct t3cdev *cdev = c3cn->cdev; + unsigned int tid = c3cn->tid; + + if (!cdev) + return; + + c3cn->qset = 0; + + c3cn_free_cpl_skbs(c3cn); + + if (c3cn->wr_avail != c3cn->wr_max) { + purge_wr_queue(c3cn); + reset_wr_list(c3cn); + } + + if (c3cn->l2t) { + l2t_release(L2DATA(cdev), c3cn->l2t); + c3cn->l2t = NULL; + } + + if (c3cn->state == C3CN_STATE_CONNECTING) /* we have ATID */ + s3_free_atid(cdev, tid); + else { /* we have TID */ + cxgb3_remove_tid(cdev, (void *)c3cn, tid); + c3cn_put(c3cn); + } + + c3cn->cdev = NULL; +} + +/** + * cxgb3i_c3cn_create - allocate and initialize an s3_conn structure + * returns the s3_conn structure allocated. 
+ */ +struct s3_conn *cxgb3i_c3cn_create(void) +{ + struct s3_conn *c3cn; + + c3cn = kzalloc(sizeof(*c3cn), GFP_KERNEL); + if (!c3cn) + return NULL; + + /* pre-allocate close/abort cpl, so we don't need to wait for memory + when close/abort is requested. */ + if (c3cn_alloc_cpl_skbs(c3cn) < 0) + goto free_c3cn; + + c3cn_conn_debug("alloc c3cn 0x%p.\n", c3cn); + + c3cn->flags = 0; + spin_lock_init(&c3cn->lock); + atomic_set(&c3cn->refcnt, 1); + skb_queue_head_init(&c3cn->receive_queue); + skb_queue_head_init(&c3cn->write_queue); + setup_timer(&c3cn->retry_timer, NULL, (unsigned long)c3cn); + rwlock_init(&c3cn->callback_lock); + + return c3cn; + +free_c3cn: + kfree(c3cn); + return NULL; +} + +static void c3cn_active_close(struct s3_conn *c3cn) +{ + int data_lost; + int close_req = 0; + + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + + dst_confirm(c3cn->dst_cache); + + c3cn_hold(c3cn); + spin_lock_bh(&c3cn->lock); + + data_lost = skb_queue_len(&c3cn->receive_queue); + __skb_queue_purge(&c3cn->receive_queue); + + switch (c3cn->state) { + case C3CN_STATE_CLOSED: + case C3CN_STATE_ACTIVE_CLOSE: + case C3CN_STATE_CLOSE_WAIT_1: + case C3CN_STATE_CLOSE_WAIT_2: + case C3CN_STATE_ABORTING: + /* nothing needs to be done */ + break; + case C3CN_STATE_CONNECTING: + /* defer until cpl_act_open_rpl or cpl_act_establish */ + c3cn_set_flag(c3cn, C3CN_ACTIVE_CLOSE_NEEDED); + break; + case C3CN_STATE_ESTABLISHED: + close_req = 1; + c3cn_set_state(c3cn, C3CN_STATE_ACTIVE_CLOSE); + break; + case C3CN_STATE_PASSIVE_CLOSE: + close_req = 1; + c3cn_set_state(c3cn, C3CN_STATE_CLOSE_WAIT_2); + break; + } + + if (close_req) { + if (data_lost) + /* Unread data was tossed, zap the connection.
*/ + send_abort_req(c3cn); + else + send_close_req(c3cn); + } + + spin_unlock_bh(&c3cn->lock); + c3cn_put(c3cn); +} + +/** + * cxgb3i_c3cn_release - close and release an iscsi tcp connection and any + * resource held + * @c3cn: the iscsi tcp connection + */ +void cxgb3i_c3cn_release(struct s3_conn *c3cn) +{ + c3cn_conn_debug("c3cn 0x%p, s %u, f 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + if (likely(c3cn->state != C3CN_STATE_CONNECTING)) + c3cn_active_close(c3cn); + else + c3cn_set_flag(c3cn, C3CN_ACTIVE_CLOSE_NEEDED); + c3cn_put(c3cn); +} + +static int is_cxgb3_dev(struct net_device *dev) +{ + struct cxgb3i_sdev_data *cdata; + + write_lock(&cdata_rwlock); + list_for_each_entry(cdata, &cdata_list, list) { + struct adap_ports *ports = &cdata->ports; + int i; + + for (i = 0; i < ports->nports; i++) + if (dev == ports->lldevs[i]) { + write_unlock(&cdata_rwlock); + return 1; + } + } + write_unlock(&cdata_rwlock); + return 0; +} + +/** + * cxgb3_egress_dev - return the cxgb3 egress device + * @root_dev: the root device anchoring the search + * @c3cn: the connection used to determine egress port in bonding mode + * @context: in bonding mode, indicates a connection set up or failover + * + * Return egress device or NULL if the egress device isn't one of our ports. 
+ */ +static struct net_device *cxgb3_egress_dev(struct net_device *root_dev, + struct s3_conn *c3cn, + int context) +{ + while (root_dev) { + if (root_dev->priv_flags & IFF_802_1Q_VLAN) + root_dev = vlan_dev_real_dev(root_dev); + else if (is_cxgb3_dev(root_dev)) + return root_dev; + else + return NULL; + } + return NULL; +} + +static struct rtable *find_route(__be32 saddr, __be32 daddr, + __be16 sport, __be16 dport) +{ + struct rtable *rt; + struct flowi fl = { + .oif = 0, + .nl_u = { + .ip4_u = { + .daddr = daddr, + .saddr = saddr, + .tos = 0 } }, + .proto = IPPROTO_TCP, + .uli_u = { + .ports = { + .sport = sport, + .dport = dport } } }; + + if (ip_route_output_flow(&init_net, &rt, &fl, NULL, 0)) + return NULL; + return rt; +} + +/* + * Assign offload parameters to some connection fields. + */ +static void init_offload_conn(struct s3_conn *c3cn, + struct t3cdev *cdev, + struct dst_entry *dst) +{ + BUG_ON(c3cn->cdev != cdev); + c3cn->wr_max = c3cn->wr_avail = T3C_DATA(cdev)->max_wrs; + c3cn->wr_unacked = 0; + c3cn->mss_idx = select_mss(c3cn, dst_mtu(dst)); + + reset_wr_list(c3cn); +} + +static int initiate_act_open(struct s3_conn *c3cn, struct net_device *dev) +{ + struct cxgb3i_sdev_data *cdata = NDEV2CDATA(dev); + struct t3cdev *cdev = cdata->cdev; + struct dst_entry *dst = c3cn->dst_cache; + struct sk_buff *skb; + + c3cn_conn_debug("c3cn 0x%p, state %u, flag 0x%lx.\n", + c3cn, c3cn->state, c3cn->flags); + /* + * Initialize connection data. Note that the flags and ULP mode are + * initialized higher up ... 
+ */ + c3cn->dev = dev; + c3cn->cdev = cdev; + c3cn->tid = cxgb3_alloc_atid(cdev, cdata->client, c3cn); + if (c3cn->tid < 0) + goto out_err; + + c3cn->qset = 0; + c3cn->l2t = t3_l2t_get(cdev, dst->neighbour, dev); + if (!c3cn->l2t) + goto free_tid; + + skb = alloc_skb(sizeof(struct cpl_act_open_req), GFP_KERNEL); + if (!skb) + goto free_l2t; + + skb->sk = (struct sock *)c3cn; + set_arp_failure_handler(skb, act_open_req_arp_failure); + + c3cn_hold(c3cn); + + init_offload_conn(c3cn, cdev, dst); + c3cn->err = 0; + + make_act_open_req(c3cn, skb, c3cn->tid, c3cn->l2t); + l2t_send(cdev, skb, c3cn->l2t); + return 0; + +free_l2t: + l2t_release(L2DATA(cdev), c3cn->l2t); +free_tid: + s3_free_atid(cdev, c3cn->tid); + c3cn->tid = 0; +out_err: + return -1; +} + + +/** + * cxgb3i_c3cn_connect - initiates an iscsi tcp connection to a given address + * @c3cn: the iscsi tcp connection + * @usin: destination address + * + * return 0 if active open request is sent, < 0 otherwise. + */ +int cxgb3i_c3cn_connect(struct s3_conn *c3cn, struct sockaddr_in *usin) +{ + struct rtable *rt; + struct net_device *dev; + struct cxgb3i_sdev_data *cdata; + struct t3cdev *cdev; + __be32 sipv4; + int err; + + if (usin->sin_family != AF_INET) + return -EAFNOSUPPORT; + + c3cn->daddr.sin_port = usin->sin_port; + c3cn->daddr.sin_addr.s_addr = usin->sin_addr.s_addr; + + rt = find_route(c3cn->saddr.sin_addr.s_addr, + c3cn->daddr.sin_addr.s_addr, + c3cn->saddr.sin_port, + c3cn->daddr.sin_port); + if (rt == NULL) { + c3cn_conn_debug("NO route to 0x%x, port %u.\n", + c3cn->daddr.sin_addr.s_addr, + ntohs(c3cn->daddr.sin_port)); + return -ENETUNREACH; + } + + if (rt->rt_flags & (RTCF_MULTICAST | RTCF_BROADCAST)) { + c3cn_conn_debug("multi-cast route to 0x%x, port %u.\n", + c3cn->daddr.sin_addr.s_addr, + ntohs(c3cn->daddr.sin_port)); + ip_rt_put(rt); + return -ENETUNREACH; + } + + if (!c3cn->saddr.sin_addr.s_addr) + c3cn->saddr.sin_addr.s_addr = rt->rt_src; + + /* now commit destination to connection */ + 
c3cn->dst_cache = &rt->u.dst; + + /* try to establish an offloaded connection */ + dev = cxgb3_egress_dev(c3cn->dst_cache->dev, c3cn, 0); + if (dev == NULL) { + c3cn_conn_debug("c3cn 0x%p, egress dev NULL.\n", c3cn); + return -ENETUNREACH; + } + cdata = NDEV2CDATA(dev); + cdev = cdata->cdev; + + /* get a source port if one hasn't been provided */ + err = c3cn_get_port(c3cn, cdata); + if (err) + return err; + + c3cn_conn_debug("c3cn 0x%p get port %u.\n", + c3cn, ntohs(c3cn->saddr.sin_port)); + + sipv4 = cxgb3i_get_private_ipv4addr(dev); + if (!sipv4) { + c3cn_conn_debug("c3cn 0x%p, iscsi ip not configured.\n", c3cn); + sipv4 = c3cn->saddr.sin_addr.s_addr; + cxgb3i_set_private_ipv4addr(dev, sipv4); + } else + c3cn->saddr.sin_addr.s_addr = sipv4; + + c3cn_conn_debug("c3cn 0x%p, %u.%u.%u.%u,%u-%u.%u.%u.%u,%u SYN_SENT.\n", + c3cn, NIPQUAD(c3cn->saddr.sin_addr.s_addr), + ntohs(c3cn->saddr.sin_port), + NIPQUAD(c3cn->daddr.sin_addr.s_addr), + ntohs(c3cn->daddr.sin_port)); + + c3cn_set_state(c3cn, C3CN_STATE_CONNECTING); + if (!initiate_act_open(c3cn, dev)) + return 0; + + /* + * If we get here, we don't have an offload connection so simply + * return a failure. + */ + err = -ENOTSUPP; + + /* + * This trashes the connection and releases the local port, + * if necessary. + */ + c3cn_conn_debug("c3cn 0x%p -> CLOSED.\n", c3cn); + c3cn_set_state(c3cn, C3CN_STATE_CLOSED); + ip_rt_put(rt); + c3cn_put_port(c3cn); + c3cn->daddr.sin_port = 0; + return err; +} + +/** + * cxgb3i_c3cn_rx_credits - ack received tcp data. + * @c3cn: iscsi tcp connection + * @copied: # of bytes processed + * + * Called after some received data has been read. It returns RX credits + * to the HW for the amount of data processed. 
+ */ +void cxgb3i_c3cn_rx_credits(struct s3_conn *c3cn, int copied) +{ + struct t3cdev *cdev; + int must_send; + u32 credits, dack = 0; + + if (c3cn->state != C3CN_STATE_ESTABLISHED) + return; + + credits = c3cn->copied_seq - c3cn->rcv_wup; + if (unlikely(!credits)) + return; + + cdev = c3cn->cdev; + + if (unlikely(cxgb3_rx_credit_thres == 0)) + return; + + dack = F_RX_DACK_CHANGE | V_RX_DACK_MODE(1); + + /* + * For coalescing to work effectively ensure the receive window has + * at least 16KB left. + */ + must_send = credits + 16384 >= cxgb3_rcv_win; + + if (must_send || credits >= cxgb3_rx_credit_thres) + c3cn->rcv_wup += send_rx_credits(c3cn, credits, dack); +} + +/** + * cxgb3i_c3cn_send_pdus - send the skbs containing iscsi pdus + * @c3cn: iscsi tcp connection + * @skb: skb contains the iscsi pdu + * + * Add a list of skbs to a connection send queue. The skbs must comply with + * the max size limit of the device and have a headroom of at least + * TX_HEADER_LEN bytes. + * Return # of bytes queued. + */ +int cxgb3i_c3cn_send_pdus(struct s3_conn *c3cn, struct sk_buff *skb) +{ + struct sk_buff *next; + int err, copied = 0; + + spin_lock_bh(&c3cn->lock); + + if (c3cn->state != C3CN_STATE_ESTABLISHED) { + c3cn_tx_debug("c3cn 0x%p, not in est. 
state %u.\n", + c3cn, c3cn->state); + err = -EAGAIN; + goto out_err; + } + + err = -EPIPE; + if (c3cn->err) { + c3cn_tx_debug("c3cn 0x%p, err %d.\n", c3cn, c3cn->err); + goto out_err; + } + + while (skb) { + int frags = skb_shinfo(skb)->nr_frags + + (skb->len != skb->data_len); + + if (unlikely(skb_headroom(skb) < TX_HEADER_LEN)) { + c3cn_tx_debug("c3cn 0x%p, skb head.\n", c3cn); + err = -EINVAL; + goto out_err; + } + + if (frags >= SKB_WR_LIST_SIZE) { + cxgb3i_log_error("c3cn 0x%p, tx frags %d, len %u,%u.\n", + c3cn, skb_shinfo(skb)->nr_frags, + skb->len, skb->data_len); + err = -EINVAL; + goto out_err; + } + + next = skb->next; + skb->next = NULL; + skb_entail(c3cn, skb, C3CB_FLAG_NO_APPEND | C3CB_FLAG_NEED_HDR); + copied += skb->len; + c3cn->write_seq += skb->len + ulp_extra_len(skb); + skb = next; + } +done: + if (likely(skb_queue_len(&c3cn->write_queue))) + c3cn_push_tx_frames(c3cn, 1); + spin_unlock_bh(&c3cn->lock); + return copied; + +out_err: + if (copied == 0 && err == -EPIPE) + copied = c3cn->err ? 
c3cn->err : -EPIPE; + goto done; +} + +static void sdev_data_cleanup(struct cxgb3i_sdev_data *cdata) +{ + struct adap_ports *ports = &cdata->ports; + int i; + + for (i = 0; i < ports->nports; i++) + NDEV2CDATA(ports->lldevs[i]) = NULL; + cxgb3i_free_big_mem(cdata); +} + +void cxgb3i_sdev_cleanup(void) +{ + struct cxgb3i_sdev_data *cdata; + + write_lock(&cdata_rwlock); + list_for_each_entry(cdata, &cdata_list, list) { + list_del(&cdata->list); + sdev_data_cleanup(cdata); + } + write_unlock(&cdata_rwlock); +} + +int cxgb3i_sdev_init(cxgb3_cpl_handler_func *cpl_handlers) +{ + cpl_handlers[CPL_ACT_ESTABLISH] = do_act_establish; + cpl_handlers[CPL_ACT_OPEN_RPL] = do_act_open_rpl; + cpl_handlers[CPL_PEER_CLOSE] = do_peer_close; + cpl_handlers[CPL_ABORT_REQ_RSS] = do_abort_req; + cpl_handlers[CPL_ABORT_RPL_RSS] = do_abort_rpl; + cpl_handlers[CPL_CLOSE_CON_RPL] = do_close_con_rpl; + cpl_handlers[CPL_TX_DMA_ACK] = do_wr_ack; + cpl_handlers[CPL_ISCSI_HDR] = do_iscsi_hdr; + + if (cxgb3_max_connect > CXGB3I_MAX_CONN) + cxgb3_max_connect = CXGB3I_MAX_CONN; + return 0; +} + +/** + * cxgb3i_sdev_add - allocate and initialize resources for each adapter found + * @cdev: t3cdev adapter + * @client: cxgb3 driver client + */ +void cxgb3i_sdev_add(struct t3cdev *cdev, struct cxgb3_client *client) +{ + struct cxgb3i_sdev_data *cdata; + struct ofld_page_info rx_page_info; + unsigned int wr_len; + int mapsize = DIV_ROUND_UP(cxgb3_max_connect, + 8 * sizeof(unsigned long)); + int i; + + cdata = cxgb3i_alloc_big_mem(sizeof(*cdata) + mapsize, GFP_KERNEL); + if (!cdata) + return; + + if (cdev->ctl(cdev, GET_WR_LEN, &wr_len) < 0 || + cdev->ctl(cdev, GET_PORTS, &cdata->ports) < 0 || + cdev->ctl(cdev, GET_RX_PAGE_INFO, &rx_page_info) < 0) + goto free_cdata; + + s3_init_wr_tab(wr_len); + + INIT_LIST_HEAD(&cdata->list); + cdata->cdev = cdev; + cdata->client = client; + + for (i = 0; i < cdata->ports.nports; i++) + NDEV2CDATA(cdata->ports.lldevs[i]) = cdata; + + write_lock(&cdata_rwlock); + 
list_add_tail(&cdata->list, &cdata_list); + write_unlock(&cdata_rwlock); + + return; + +free_cdata: + cxgb3i_free_big_mem(cdata); +} + +/** + * cxgb3i_sdev_remove - free the allocated resources for the adapter + * @cdev: t3cdev adapter + */ +void cxgb3i_sdev_remove(struct t3cdev *cdev) +{ + struct cxgb3i_sdev_data *cdata = CXGB3_SDEV_DATA(cdev); + + write_lock(&cdata_rwlock); + list_del(&cdata->list); + write_unlock(&cdata_rwlock); + + sdev_data_cleanup(cdata); +} diff --git a/drivers/scsi/cxgb3i/cxgb3i_offload.h b/drivers/scsi/cxgb3i/cxgb3i_offload.h new file mode 100644 index 00000000000..5b93d629e5c --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_offload.h @@ -0,0 +1,231 @@ +/* + * cxgb3i_offload.h: Chelsio S3xx iscsi offloaded tcp connection management + * + * Copyright (C) 2003-2008 Chelsio Communications. All rights reserved. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the LICENSE file included in this + * release for licensing terms and conditions. + * + * Written by: Dimitris Michailidis (dm@chelsio.com) + * Karen Xie (kxie@chelsio.com) + */ + +#ifndef _CXGB3I_OFFLOAD_H +#define _CXGB3I_OFFLOAD_H + +#include +#include + +#include "common.h" +#include "adapter.h" +#include "t3cdev.h" +#include "cxgb3_offload.h" + +#define cxgb3i_log_error(fmt...) printk(KERN_ERR "cxgb3i: ERR! " fmt) +#define cxgb3i_log_warn(fmt...) printk(KERN_WARNING "cxgb3i: WARN! " fmt) +#define cxgb3i_log_info(fmt...) printk(KERN_INFO "cxgb3i: " fmt) +#define cxgb3i_log_debug(fmt, args...) 
\ + printk(KERN_INFO "cxgb3i: %s - " fmt, __func__ , ## args) + +/** + * struct s3_conn - an iscsi tcp connection structure + * + * @dev: net device associated with connection + * @cdev: adapter t3cdev for net device + * @flags: see c3cn_flags below + * @tid: connection id assigned by the h/w + * @qset: queue set used by connection + * @mss_idx: Maximum Segment Size table index + * @l2t: ARP resolution entry for offload packets + * @wr_max: maximum in-flight writes + * @wr_avail: number of writes available + * @wr_unacked: writes since last request for completion notification + * @wr_pending_head: head of pending write queue + * @wr_pending_tail: tail of pending write queue + * @cpl_close: skb for cpl_close_req + * @cpl_abort_req: skb for cpl_abort_req + * @cpl_abort_rpl: skb for cpl_abort_rpl + * @lock: connection status lock + * @refcnt: reference count on connection + * @state: connection state + * @saddr: source ip/port address + * @daddr: destination ip/port address + * @dst_cache: reference to destination route + * @receive_queue: received PDUs + * @write_queue: un-pushed pending writes + * @retry_timer: retry timer for various operations + * @err: connection error status + * @callback_lock: lock for opaque user context + * @user_data: opaque user context + * @rcv_nxt: next receive seq.
# + * @copied_seq: head of yet unread data + * @rcv_wup: rcv_nxt on last window update sent + * @snd_nxt: next sequence we send + * @snd_una: first byte we want an ack for + * @write_seq: tail+1 of data held in send buffer + */ +struct s3_conn { + struct net_device *dev; + struct t3cdev *cdev; + unsigned long flags; + int tid; + int qset; + int mss_idx; + struct l2t_entry *l2t; + int wr_max; + int wr_avail; + int wr_unacked; + struct sk_buff *wr_pending_head; + struct sk_buff *wr_pending_tail; + struct sk_buff *cpl_close; + struct sk_buff *cpl_abort_req; + struct sk_buff *cpl_abort_rpl; + spinlock_t lock; + atomic_t refcnt; + volatile unsigned int state; + struct sockaddr_in saddr; + struct sockaddr_in daddr; + struct dst_entry *dst_cache; + struct sk_buff_head receive_queue; + struct sk_buff_head write_queue; + struct timer_list retry_timer; + int err; + rwlock_t callback_lock; + void *user_data; + + u32 rcv_nxt; + u32 copied_seq; + u32 rcv_wup; + u32 snd_nxt; + u32 snd_una; + u32 write_seq; +}; + +/* + * connection state + */ +enum conn_states { + C3CN_STATE_CONNECTING = 1, + C3CN_STATE_ESTABLISHED, + C3CN_STATE_ACTIVE_CLOSE, + C3CN_STATE_PASSIVE_CLOSE, + C3CN_STATE_CLOSE_WAIT_1, + C3CN_STATE_CLOSE_WAIT_2, + C3CN_STATE_ABORTING, + C3CN_STATE_CLOSED, +}; + +static inline unsigned int c3cn_is_closing(const struct s3_conn *c3cn) +{ + return c3cn->state >= C3CN_STATE_ACTIVE_CLOSE; +} +static inline unsigned int c3cn_is_established(const struct s3_conn *c3cn) +{ + return c3cn->state == C3CN_STATE_ESTABLISHED; +} + +/* + * Connection flags -- many to track some close related events. + */ +enum c3cn_flags { + C3CN_ABORT_RPL_RCVD, /* received one ABORT_RPL_RSS message */ + C3CN_ABORT_REQ_RCVD, /* received one ABORT_REQ_RSS message */ + C3CN_ABORT_RPL_PENDING, /* expecting an abort reply */ + C3CN_TX_DATA_SENT, /* already sent a TX_DATA WR */ + C3CN_ACTIVE_CLOSE_NEEDED, /* need to be closed */ +}; + +/** + * cxgb3i_sdev_data - Per adapter data. 
+ * Linked off of each Ethernet device port on the adapter. + * Also available via the t3cdev structure since we have pointers to our port + * net_device's there ... + * + * @list: list head to link elements + * @cdev: t3cdev adapter + * @client: CPL client pointer + * @ports: array of adapter ports + * @sport_map_next: next index into the port map + * @sport_map: source port map + */ +struct cxgb3i_sdev_data { + struct list_head list; + struct t3cdev *cdev; + struct cxgb3_client *client; + struct adap_ports ports; + unsigned int sport_map_next; + unsigned long sport_map[0]; +}; +#define NDEV2CDATA(ndev) (*(struct cxgb3i_sdev_data **)&(ndev)->ec_ptr) +#define CXGB3_SDEV_DATA(cdev) NDEV2CDATA((cdev)->lldev) + +void cxgb3i_sdev_cleanup(void); +int cxgb3i_sdev_init(cxgb3_cpl_handler_func *); +void cxgb3i_sdev_add(struct t3cdev *, struct cxgb3_client *); +void cxgb3i_sdev_remove(struct t3cdev *); + +struct s3_conn *cxgb3i_c3cn_create(void); +int cxgb3i_c3cn_connect(struct s3_conn *, struct sockaddr_in *); +void cxgb3i_c3cn_rx_credits(struct s3_conn *, int); +int cxgb3i_c3cn_send_pdus(struct s3_conn *, struct sk_buff *); +void cxgb3i_c3cn_release(struct s3_conn *); + +/** + * cxgb3_skb_cb - control block for received pdu state and ULP mode management. 
+ * + * @flag: see C3CB_FLAG_* below + * @ulp_mode: ULP mode/submode of sk_buff + * @seq: tcp sequence number + * @ddigest: pdu data digest + * @pdulen: recovered pdu length + * @ulp_data: scratch area for ULP + */ +struct cxgb3_skb_cb { + __u8 flags; + __u8 ulp_mode; + __u32 seq; + __u32 ddigest; + __u32 pdulen; + __u8 ulp_data[16]; +}; + +#define CXGB3_SKB_CB(skb) ((struct cxgb3_skb_cb *)&((skb)->cb[0])) + +#define skb_ulp_mode(skb) (CXGB3_SKB_CB(skb)->ulp_mode) +#define skb_ulp_ddigest(skb) (CXGB3_SKB_CB(skb)->ddigest) +#define skb_ulp_pdulen(skb) (CXGB3_SKB_CB(skb)->pdulen) +#define skb_ulp_data(skb) (CXGB3_SKB_CB(skb)->ulp_data) + +enum c3cb_flags { + C3CB_FLAG_NEED_HDR = 1 << 0, /* packet needs a TX_DATA_WR header */ + C3CB_FLAG_NO_APPEND = 1 << 1, /* don't grow this skb */ + C3CB_FLAG_COMPL = 1 << 2, /* request WR completion */ +}; + +/** + * sge_opaque_hdr - + * Opaque version of structure the SGE stores at skb->head of TX_DATA packets + * and for which we must reserve space. + */ +struct sge_opaque_hdr { + void *dev; + dma_addr_t addr[MAX_SKB_FRAGS + 1]; +}; + +/* for TX: a skb must have a headroom of at least TX_HEADER_LEN bytes */ +#define TX_HEADER_LEN \ + (sizeof(struct tx_data_wr) + sizeof(struct sge_opaque_hdr)) + +/* + * get and set private ip for iscsi traffic + */ +#define cxgb3i_get_private_ipv4addr(ndev) \ + (((struct port_info *)(netdev_priv(ndev)))->iscsi_ipv4addr) +#define cxgb3i_set_private_ipv4addr(ndev, addr) \ + (((struct port_info *)(netdev_priv(ndev)))->iscsi_ipv4addr) = addr + +/* max. connections per adapter */ +#define CXGB3I_MAX_CONN 16384 +#endif /* _CXGB3_OFFLOAD_H */ diff --git a/drivers/scsi/cxgb3i/cxgb3i_pdu.c b/drivers/scsi/cxgb3i/cxgb3i_pdu.c new file mode 100644 index 00000000000..ce7ce8c6094 --- /dev/null +++ b/drivers/scsi/cxgb3i/cxgb3i_pdu.c @@ -0,0 +1,402 @@ +/* + * cxgb3i_pdu.c: Chelsio S3xx iSCSI driver. + * + * Copyright (c) 2008 Chelsio Communications, Inc. 
+ * Copyright (c) 2008 Mike Christie + * Copyright (c) 2008 Red Hat, Inc. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + * + * Written by: Karen Xie (kxie@chelsio.com) + */ + +#include +#include +#include +#include + +#include "cxgb3i.h" +#include "cxgb3i_pdu.h" + +#ifdef __DEBUG_CXGB3I_RX__ +#define cxgb3i_rx_debug cxgb3i_log_debug +#else +#define cxgb3i_rx_debug(fmt...) +#endif + +#ifdef __DEBUG_CXGB3I_TX__ +#define cxgb3i_tx_debug cxgb3i_log_debug +#else +#define cxgb3i_tx_debug(fmt...) +#endif + +static struct page *pad_page; + +/* + * pdu receive, interact with libiscsi_tcp + */ +static inline int read_pdu_skb(struct iscsi_conn *conn, struct sk_buff *skb, + unsigned int offset, int offloaded) +{ + int status = 0; + int bytes_read; + + bytes_read = iscsi_tcp_recv_skb(conn, skb, offset, offloaded, &status); + switch (status) { + case ISCSI_TCP_CONN_ERR: + return -EIO; + case ISCSI_TCP_SUSPENDED: + /* no transfer - just have caller flush queue */ + return bytes_read; + case ISCSI_TCP_SKB_DONE: + /* + * pdus should always fit in the skb and we should get + * segment done notification.
+ */ + iscsi_conn_printk(KERN_ERR, conn, "Invalid pdu or skb."); + return -EFAULT; + case ISCSI_TCP_SEGMENT_DONE: + return bytes_read; + default: + iscsi_conn_printk(KERN_ERR, conn, "Invalid iscsi_tcp_recv_skb " + "status %d\n", status); + return -EINVAL; + } +} + +static int cxgb3i_conn_read_pdu_skb(struct iscsi_conn *conn, + struct sk_buff *skb) +{ + struct iscsi_tcp_conn *tcp_conn = conn->dd_data; + bool offloaded = 0; + unsigned int offset; + int rc; + + cxgb3i_rx_debug("conn 0x%p, skb 0x%p, len %u, flag 0x%x.\n", + conn, skb, skb->len, skb_ulp_mode(skb)); + + if (!iscsi_tcp_recv_segment_is_hdr(tcp_conn)) { + iscsi_conn_failure(conn, ISCSI_ERR_PROTO); + return -EIO; + } + + if (conn->hdrdgst_en && (skb_ulp_mode(skb) & ULP2_FLAG_HCRC_ERROR)) { + iscsi_conn_failure(conn, ISCSI_ERR_HDR_DGST); + return -EIO; + } + + if (conn->datadgst_en && (skb_ulp_mode(skb) & ULP2_FLAG_DCRC_ERROR)) { + iscsi_conn_failure(conn, ISCSI_ERR_DATA_DGST); + return -EIO; + } + + /* iscsi hdr */ + rc = read_pdu_skb(conn, skb, 0, 0); + if (rc <= 0) + return rc; + + if (iscsi_tcp_recv_segment_is_hdr(tcp_conn)) + return 0; + + offset = rc; + if (conn->hdrdgst_en) + offset += ISCSI_DIGEST_SIZE; + + /* iscsi data */ + if (skb_ulp_mode(skb) & ULP2_FLAG_DATA_DDPED) { + cxgb3i_rx_debug("skb 0x%p, opcode 0x%x, data %u, ddp'ed, " + "itt 0x%x.\n", + skb, + tcp_conn->in.hdr->opcode & ISCSI_OPCODE_MASK, + tcp_conn->in.datalen, + ntohl(tcp_conn->in.hdr->itt)); + offloaded = 1; + } else { + cxgb3i_rx_debug("skb 0x%p, opcode 0x%x, data %u, NOT ddp'ed, " + "itt 0x%x.\n", + skb, + tcp_conn->in.hdr->opcode & ISCSI_OPCODE_MASK, + tcp_conn->in.datalen, + ntohl(tcp_conn->in.hdr->itt)); + offset += sizeof(struct cpl_iscsi_hdr_norss); + } + + rc = read_pdu_skb(conn, skb, offset, offloaded); + if (rc < 0) + return rc; + else + return 0; +} + +/* + * pdu transmit, interact with libiscsi_tcp + */ +static inline void tx_skb_setmode(struct sk_buff *skb, int hcrc, int dcrc) +{ + u8 submode = 0; + + if (hcrc) + submode 
|= 1; + if (dcrc) + submode |= 2; + skb_ulp_mode(skb) = (ULP_MODE_ISCSI << 4) | submode; +} + +void cxgb3i_conn_cleanup_task(struct iscsi_task *task) +{ + struct iscsi_tcp_task *tcp_task = task->dd_data; + + /* never reached the xmit task callout */ + if (tcp_task->dd_data) + kfree_skb(tcp_task->dd_data); + tcp_task->dd_data = NULL; + + /* MNC - Do we need a check in case this is called but + * cxgb3i_conn_alloc_pdu has never been called on the task */ + cxgb3i_release_itt(task, task->hdr_itt); + iscsi_tcp_cleanup_task(task); +} + +/* + * We do not support ahs yet + */ +int cxgb3i_conn_alloc_pdu(struct iscsi_task *task, u8 opcode) +{ + struct iscsi_tcp_task *tcp_task = task->dd_data; + struct sk_buff *skb; + + task->hdr = NULL; + /* always allocate rooms for AHS */ + skb = alloc_skb(sizeof(struct iscsi_hdr) + ISCSI_MAX_AHS_SIZE + + TX_HEADER_LEN, GFP_ATOMIC); + if (!skb) + return -ENOMEM; + + cxgb3i_tx_debug("task 0x%p, opcode 0x%x, skb 0x%p.\n", + task, opcode, skb); + + tcp_task->dd_data = skb; + skb_reserve(skb, TX_HEADER_LEN); + task->hdr = (struct iscsi_hdr *)skb->data; + task->hdr_max = sizeof(struct iscsi_hdr); + + /* data_out uses scsi_cmd's itt */ + if (opcode != ISCSI_OP_SCSI_DATA_OUT) + cxgb3i_reserve_itt(task, &task->hdr->itt); + + return 0; +} + +int cxgb3i_conn_init_pdu(struct iscsi_task *task, unsigned int offset, + unsigned int count) +{ + struct iscsi_tcp_task *tcp_task = task->dd_data; + struct sk_buff *skb = tcp_task->dd_data; + struct iscsi_conn *conn = task->conn; + struct page *pg; + unsigned int datalen = count; + int i, padlen = iscsi_padding(count); + skb_frag_t *frag; + + cxgb3i_tx_debug("task 0x%p,0x%p, offset %u, count %u, skb 0x%p.\n", + task, task->sc, offset, count, skb); + + skb_put(skb, task->hdr_len); + tx_skb_setmode(skb, conn->hdrdgst_en, datalen ? 
conn->datadgst_en : 0); + if (!count) + return 0; + + if (task->sc) { + struct scatterlist *sg; + struct scsi_data_buffer *sdb; + unsigned int sgoffset = offset; + struct page *sgpg; + unsigned int sglen; + + sdb = scsi_out(task->sc); + sg = sdb->table.sgl; + + for_each_sg(sdb->table.sgl, sg, sdb->table.nents, i) { + cxgb3i_tx_debug("sg %d, page 0x%p, len %u offset %u\n", + i, sg_page(sg), sg->length, sg->offset); + + if (sgoffset < sg->length) + break; + sgoffset -= sg->length; + } + sgpg = sg_page(sg); + sglen = sg->length - sgoffset; + + do { + int j = skb_shinfo(skb)->nr_frags; + unsigned int copy; + + if (!sglen) { + sg = sg_next(sg); + sgpg = sg_page(sg); + sgoffset = 0; + sglen = sg->length; + ++i; + } + copy = min(sglen, datalen); + if (j && skb_can_coalesce(skb, j, sgpg, + sg->offset + sgoffset)) { + skb_shinfo(skb)->frags[j - 1].size += copy; + } else { + get_page(sgpg); + skb_fill_page_desc(skb, j, sgpg, + sg->offset + sgoffset, copy); + } + sgoffset += copy; + sglen -= copy; + datalen -= copy; + } while (datalen); + } else { + pg = virt_to_page(task->data); + + while (datalen) { + i = skb_shinfo(skb)->nr_frags; + frag = &skb_shinfo(skb)->frags[i]; + + get_page(pg); + frag->page = pg; + frag->page_offset = 0; + frag->size = min((unsigned int)PAGE_SIZE, datalen); + + skb_shinfo(skb)->nr_frags++; + datalen -= frag->size; + pg++; + } + } + + if (padlen) { + i = skb_shinfo(skb)->nr_frags; + frag = &skb_shinfo(skb)->frags[i]; + frag->page = pad_page; + frag->page_offset = 0; + frag->size = padlen; + skb_shinfo(skb)->nr_frags++; + } + + datalen = count + padlen; + skb->data_len += datalen; + skb->truesize += datalen; + skb->len += datalen; + return 0; +} + +int cxgb3i_conn_xmit_pdu(struct iscsi_task *task) +{ + struct iscsi_tcp_task *tcp_task = task->dd_data; + struct sk_buff *skb = tcp_task->dd_data; + struct iscsi_tcp_conn *tcp_conn = task->conn->dd_data; + struct cxgb3i_conn *cconn = tcp_conn->dd_data; + unsigned int datalen; + int err; + + if (!skb) + 
return 0; + + datalen = skb->data_len; + tcp_task->dd_data = NULL; + err = cxgb3i_c3cn_send_pdus(cconn->cep->c3cn, skb); + cxgb3i_tx_debug("task 0x%p, skb 0x%p, len %u/%u, rv %d.\n", + task, skb, skb->len, skb->data_len, err); + if (err > 0) { + int pdulen = err; + + if (task->conn->hdrdgst_en) + pdulen += ISCSI_DIGEST_SIZE; + if (datalen && task->conn->datadgst_en) + pdulen += ISCSI_DIGEST_SIZE; + + task->conn->txdata_octets += pdulen; + return 0; + } + + if (err < 0 && err != -EAGAIN) { + kfree_skb(skb); + cxgb3i_tx_debug("itt 0x%x, skb 0x%p, len %u/%u, xmit err %d.\n", + task->itt, skb, skb->len, skb->data_len, err); + iscsi_conn_printk(KERN_ERR, task->conn, "xmit err %d.\n", err); + iscsi_conn_failure(task->conn, ISCSI_ERR_XMIT_FAILED); + return err; + } + /* reset skb to send when we are called again */ + tcp_task->dd_data = skb; + return -EAGAIN; +} + +int cxgb3i_pdu_init(void) +{ + pad_page = alloc_page(GFP_KERNEL); + if (!pad_page) + return -ENOMEM; + memset(page_address(pad_page), 0, PAGE_SIZE); + return 0; +} + +void cxgb3i_pdu_cleanup(void) +{ + if (pad_page) { + __free_page(pad_page); + pad_page = NULL; + } +} + +void cxgb3i_conn_pdu_ready(struct s3_conn *c3cn) +{ + struct sk_buff *skb; + unsigned int read = 0; + struct iscsi_conn *conn = c3cn->user_data; + int err = 0; + + cxgb3i_rx_debug("cn 0x%p.\n", c3cn); + + read_lock(&c3cn->callback_lock); + if (unlikely(!conn || conn->suspend_rx)) { + cxgb3i_rx_debug("conn 0x%p, id %d, suspend_rx %lu!\n", + conn, conn ? conn->id : 0xFF, + conn ? 
conn->suspend_rx : 0xFF);
+		read_unlock(&c3cn->callback_lock);
+		return;
+	}
+	skb = skb_peek(&c3cn->receive_queue);
+	while (!err && skb) {
+		__skb_unlink(skb, &c3cn->receive_queue);
+		read += skb_ulp_pdulen(skb);
+		err = cxgb3i_conn_read_pdu_skb(conn, skb);
+		__kfree_skb(skb);
+		skb = skb_peek(&c3cn->receive_queue);
+	}
+	read_unlock(&c3cn->callback_lock);
+	if (c3cn) {
+		c3cn->copied_seq += read;
+		cxgb3i_c3cn_rx_credits(c3cn, read);
+	}
+	conn->rxdata_octets += read;
+}
+
+void cxgb3i_conn_tx_open(struct s3_conn *c3cn)
+{
+	struct iscsi_conn *conn = c3cn->user_data;
+
+	cxgb3i_tx_debug("cn 0x%p.\n", c3cn);
+	if (conn) {
+		cxgb3i_tx_debug("cn 0x%p, cid %d.\n", c3cn, conn->id);
+		scsi_queue_work(conn->session->host, &conn->xmitwork);
+	}
+}
+
+void cxgb3i_conn_closing(struct s3_conn *c3cn)
+{
+	struct iscsi_conn *conn;
+
+	read_lock(&c3cn->callback_lock);
+	conn = c3cn->user_data;
+	if (conn && c3cn->state != C3CN_STATE_ESTABLISHED)
+		iscsi_conn_failure(conn, ISCSI_ERR_CONN_FAILED);
+	read_unlock(&c3cn->callback_lock);
+}
diff --git a/drivers/scsi/cxgb3i/cxgb3i_pdu.h b/drivers/scsi/cxgb3i/cxgb3i_pdu.h
new file mode 100644
index 00000000000..a3f685cc236
--- /dev/null
+++ b/drivers/scsi/cxgb3i/cxgb3i_pdu.h
@@ -0,0 +1,59 @@
+/*
+ * cxgb3i_ulp2.h: Chelsio S3xx iSCSI driver.
+ *
+ * Copyright (c) 2008 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * Written by: Karen Xie (kxie@chelsio.com)
+ */
+
+#ifndef __CXGB3I_ULP2_PDU_H__
+#define __CXGB3I_ULP2_PDU_H__
+
+struct cpl_iscsi_hdr_norss {
+	union opcode_tid ot;
+	u16 pdu_len_ddp;
+	u16 len;
+	u32 seq;
+	u16 urg;
+	u8 rsvd;
+	u8 status;
+};
+
+struct cpl_rx_data_ddp_norss {
+	union opcode_tid ot;
+	u16 urg;
+	u16 len;
+	u32 seq;
+	u32 nxt_seq;
+	u32 ulp_crc;
+	u32 ddp_status;
+};
+
+#define RX_DDP_STATUS_IPP_SHIFT		27	/* invalid pagepod */
+#define RX_DDP_STATUS_TID_SHIFT		26	/* tid mismatch */
+#define RX_DDP_STATUS_COLOR_SHIFT	25	/* color mismatch */
+#define RX_DDP_STATUS_OFFSET_SHIFT	24	/* offset mismatch */
+#define RX_DDP_STATUS_ULIMIT_SHIFT	23	/* ulimit error */
+#define RX_DDP_STATUS_TAG_SHIFT		22	/* tag mismatch */
+#define RX_DDP_STATUS_DCRC_SHIFT	21	/* dcrc error */
+#define RX_DDP_STATUS_HCRC_SHIFT	20	/* hcrc error */
+#define RX_DDP_STATUS_PAD_SHIFT		19	/* pad error */
+#define RX_DDP_STATUS_PPP_SHIFT		18	/* pagepod parity error */
+#define RX_DDP_STATUS_LLIMIT_SHIFT	17	/* llimit error */
+#define RX_DDP_STATUS_DDP_SHIFT		16	/* ddp'able */
+#define RX_DDP_STATUS_PMM_SHIFT		15	/* pagepod mismatch */
+
+#define ULP2_FLAG_DATA_READY		0x1
+#define ULP2_FLAG_DATA_DDPED		0x2
+#define ULP2_FLAG_HCRC_ERROR		0x10
+#define ULP2_FLAG_DCRC_ERROR		0x20
+#define ULP2_FLAG_PAD_ERROR		0x40
+
+void cxgb3i_conn_closing(struct s3_conn *);
+void cxgb3i_conn_pdu_ready(struct s3_conn *c3cn);
+void cxgb3i_conn_tx_open(struct s3_conn *c3cn);
+#endif
-- 
cgit v1.2.3-70-g09d2

From fb5edd020fa0fbe991f4a473611ad530d2237425 Mon Sep 17 00:00:00 2001
From: James Bottomley
Date: Tue, 30 Dec 2008 09:44:29 -0600
Subject: [SCSI] fcoe: fix configuration problems

fcoe selects libfc and requires SCSI and PCI (the SCSI requirement is
implicitly covered by an enclosing if). Fix them both up so they cannot
be configured in an invalid state: make LIBFC select SCSI_FC_ATTRS and
make FCOE depend on PCI and select LIBFC.

Reported-by: Randy Dunlap
Cc: Robert Love
Signed-off-by: James Bottomley
---
 drivers/scsi/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'drivers/scsi/Kconfig')

diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 0e5e084dfb4..152d4aa9354 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -607,13 +607,13 @@ config SCSI_FLASHPOINT

 config LIBFC
 	tristate "LibFC module"
-	depends on SCSI && SCSI_FC_ATTRS
+	select SCSI_FC_ATTRS
 	---help---
 	  Fibre Channel library module

 config FCOE
 	tristate "FCoE module"
-	depends on SCSI
+	depends on PCI
 	select LIBFC
 	---help---
 	  Fibre Channel over Ethernet module
-- 
cgit v1.2.3-70-g09d2
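
For reference, a sketch of how the two affected entries in drivers/scsi/Kconfig would read with the fcoe configuration fix applied. It is reconstructed only from the hunk in that patch; the indentation and the surrounding entries are assumed, not copied from the tree.

```
# Sketch of the post-patch state, rebuilt from the hunk (whitespace assumed).
config LIBFC
	tristate "LibFC module"
	select SCSI_FC_ATTRS
	---help---
	  Fibre Channel library module

config FCOE
	tristate "FCoE module"
	depends on PCI
	select LIBFC
	---help---
	  Fibre Channel over Ethernet module
```

Because FCOE selects LIBFC, and select turns a symbol on without consulting that symbol's own "depends on" lines, LIBFC could previously be enabled while SCSI_FC_ATTRS was off. Having LIBFC select SCSI_FC_ATTRS closes that gap, and the SCSI requirement is left to the enclosing if, as the commit message notes.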