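How ipset communicates with the netfilter kernel module
I recently needed to work with ipset, iptables, and netfilter, so I read through the source code of all three.
As covered earlier, the userspace ipset tool and the netfilter kernel module communicate over a netlink socket.
The userspace ipset command talks to the kernel through the libipset.so library.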
1. The main ipset flow
Below is the main flow as I have summarized it.
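2. How userspace passes the name and type of a new set to the kernel
ipset can create sets of different types, such as "hash:ip", "hash:ip,port", and "hash:net,port".
From running the command down to kernel space, the flow is:
ipset command line -> libipset.so -> ip_set.ko kernel module -> ip_set_hash_ip.ko kernel module (selected by set type)
So how does userspace parse the command and the set type, and how does it pass the set name and type down to the kernel?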
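The ipset_parse_argv function parses the ipset create command.
ipset_parse_setname parses the name of the ipset set being created.
ipset_parse_typename parses the type of the ipset set being created.
The call ret = ipset_parse_setname(session, IPSET_SETNAME, arg0); stores the value of arg0 in the session's setname member.
Since I care more about the set type, let's step into ipset_parse_typename.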
/* Find the corresponding type */
typename = ipset_typename_resolve(str);
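According to the comment, ipset_typename_resolve finds the corresponding set type. A reasonable guess: if the command line says "hash:ip", then "hash:ip" is what lets us look up the typename.
Found a clue: the typelist linked list. ipset_typename_resolve walks typelist, uses ipset_match_typename() to match the type name, and returns the type name when a match succeeds. Now we need to find the function that adds elements to typelist!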
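The lookup described above can be sketched roughly as follows. This is a simplified illustration, not the libipset implementation: `struct demo_type`, `demo_match_typename`, and `demo_typename_resolve` are hypothetical stand-ins, and the real `struct ipset_type` carries many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for libipset's struct ipset_type. */
struct demo_type {
    const char *name;        /* canonical name, e.g. "hash:ip" */
    const char *alias[4];    /* NULL-terminated list of aliases */
    struct demo_type *next;  /* next entry in the type list */
};

/* Return 1 if str equals the type's name or one of its aliases. */
static int demo_match_typename(const char *str, const struct demo_type *t)
{
    if (strcmp(str, t->name) == 0)
        return 1;
    for (size_t i = 0; i < 4 && t->alias[i] != NULL; i++)
        if (strcmp(str, t->alias[i]) == 0)
            return 1;
    return 0;
}

/* Walk the list and return the canonical name of the first match. */
static const char *demo_typename_resolve(const struct demo_type *list,
                                         const char *str)
{
    assert(str);
    for (const struct demo_type *t = list; t != NULL; t = t->next)
        if (demo_match_typename(str, t))
            return t->name;
    return NULL;  /* no registered type has this name or alias */
}
```

With a list containing a "hash:ip" entry whose alias is "iphash", resolving either string would return "hash:ip", which mirrors why the old-style name still works on the command line.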
The ipset_type_add function!
The ipset_type_add function!
The ipset_type_add function!
Important things are worth saying three times! Here is its English comment:
/**
* ipset_type_add - add (register) a userspace set type
* @type: pointer to the set type structure
*
* Add the given set type to the type list. The types
* are added sorted, in descending revision number.
*
* Returns 0 on success or a negative error code.
*/
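It adds (registers) a userspace set type. OK, this is the one we were looking for.
So who calls ipset_type_add?
Open ipset_hash_ip.c and find its _init function.
Now look at the definition and initialization of the ipset_hash_ip0 struct: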
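The comment says types are kept sorted in descending revision order. A minimal sketch of such a sorted insert is shown here, assuming a singly linked list and that duplicates are rejected with -EEXIST; `struct demo_type`, `demo_typelist`, and `demo_type_add` are hypothetical names, and the real ipset_type_add may differ in its details.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct ipset_type. */
struct demo_type {
    const char *name;
    unsigned int revision;
    struct demo_type *next;
};

static struct demo_type *demo_typelist;  /* head of the type list */

/* Insert so that, among same-named types, higher revisions come first. */
static int demo_type_add(struct demo_type *type)
{
    struct demo_type **pos = &demo_typelist;

    assert(type && type->name);

    /* Skip to the first entry with the same name, if there is one. */
    while (*pos != NULL && strcmp((*pos)->name, type->name) != 0)
        pos = &(*pos)->next;
    /* Within that run, skip entries with a higher revision. */
    while (*pos != NULL && strcmp((*pos)->name, type->name) == 0 &&
           (*pos)->revision > type->revision)
        pos = &(*pos)->next;
    /* Assumption: the same name and revision may be registered only once. */
    if (*pos != NULL && strcmp((*pos)->name, type->name) == 0 &&
        (*pos)->revision == type->revision)
        return -EEXIST;

    type->next = *pos;
    *pos = type;
    return 0;
}
```

Keeping the newest revision first means a lookup that stops at the first match naturally prefers the most recent revision of a type.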
/* Initial release */
static struct ipset_type ipset_hash_ip0 = {
.name = "hash:ip",
.alias = { "iphash", NULL },
.revision = 0,
.family = NFPROTO_IPSET_IPV46,
.dimension = IPSET_DIM_ONE,
.elem = {
[IPSET_DIM_ONE - 1] = {
.parse = ipset_parse_ip4_single6,
.print = ipset_print_ip,
.opt = IPSET_OPT_IP
},
},
.cmd = {
[IPSET_CREATE] = {
.args = {
IPSET_ARG_FAMILY,
/* Aliases */
IPSET_ARG_INET,
IPSET_ARG_INET6,
IPSET_ARG_HASHSIZE,
IPSET_ARG_MAXELEM,
IPSET_ARG_NETMASK,
IPSET_ARG_TIMEOUT,
/* Ignored options: backward compatibility */
IPSET_ARG_PROBES,
IPSET_ARG_RESIZE,
IPSET_ARG_GC,
IPSET_ARG_NONE,
},
.need = 0,
.full = 0,
.help = "",
},
[IPSET_ADD] = {
.args = {
IPSET_ARG_TIMEOUT,
IPSET_ARG_NONE,
},
.need = IPSET_FLAG(IPSET_OPT_IP),
.full = IPSET_FLAG(IPSET_OPT_IP)
| IPSET_FLAG(IPSET_OPT_IP_TO),
.help = "IP",
},
[IPSET_DEL] = {
.args = {
IPSET_ARG_NONE,
},
.need = IPSET_FLAG(IPSET_OPT_IP),
.full = IPSET_FLAG(IPSET_OPT_IP)
| IPSET_FLAG(IPSET_OPT_IP_TO),
.help = "IP",
},
[IPSET_TEST] = {
.args = {
IPSET_ARG_NONE,
},
.need = IPSET_FLAG(IPSET_OPT_IP),
.full = IPSET_FLAG(IPSET_OPT_IP)
| IPSET_FLAG(IPSET_OPT_IP_TO),
.help = "IP",
},
},
.usage = "where depending on the INET family\n"
" IP is a valid IPv4 or IPv6 address (or hostname),\n"
" CIDR is a valid IPv4 or IPv6 CIDR prefix.\n"
" Adding/deleting multiple elements in IP/CIDR or FROM-TO form\n"
" is supported for IPv4.",
.description = "Initial revision",
};
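In the struct above, the `.need` field is a bitmask built with IPSET_FLAG. The sketch below shows how such a mask could be checked against the options a user actually supplied. It assumes IPSET_FLAG sets one bit per option number; the option values and the names `DEMO_FLAG` and `demo_missing_opts` are hypothetical, not taken from libipset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical option numbers; the real ones live in the libipset headers. */
enum demo_opt {
    DEMO_OPT_IP = 1,
    DEMO_OPT_IP_TO = 2,
    DEMO_OPT_TIMEOUT = 3,
};

/* Assumed shape of IPSET_FLAG: one bit per option number. */
#define DEMO_FLAG(opt) (UINT64_C(1) << (opt))

/*
 * Return the bits that are required by `need` but absent from `given`.
 * A result of 0 means every mandatory option was supplied.
 */
static uint64_t demo_missing_opts(uint64_t given, uint64_t need)
{
    return need & ~given;
}
```

For the add command above, `need` would be the flag for the IP option only, so supplying an IP with or without a timeout passes, while supplying only a timeout reports the IP bit as missing.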
//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
3. Netlink socket initialization
To use a netlink socket you first have to create one, so I expected to find code like this:
skfd = socket(AF_NETLINK, SOCK_RAW, NETLINK_TEST);
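But I could not find anything like it in the ipset source.
Later I read in a book that ipset uses the libmnl library to work with netlink sockets, and that initialization happens in the ipset_mnl_init function: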
static struct ipset_handle *
ipset_mnl_init(mnl_cb_t *cb_ctl, void *data)
{
struct ipset_handle *handle;
assert(cb_ctl);
assert(data);
handle = calloc(1, sizeof(*handle));
if (!handle)
return NULL;
handle->h = mnl_socket_open(NETLINK_NETFILTER);
if (!handle->h)
goto free_handle;
if (mnl_socket_bind(handle->h, 0, MNL_SOCKET_AUTOPID) < 0)
goto close_nl;
handle->portid = mnl_socket_get_portid(handle->h);
handle->cb_ctl = cb_ctl;
handle->data = data;
handle->seq = time(NULL);
return handle;
close_nl:
mnl_socket_close(handle->h);
free_handle:
free(handle);
return NULL;
}
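mnl_socket_open is passed the NETLINK_NETFILTER protocol and creates the netlink socket.
mnl_socket_bind binds the socket. The groups argument is 0 and the pid argument is MNL_SOCKET_AUTOPID, so the port ID is assigned automatically.
mnl_socket_get_portid returns the netlink port ID of the given netlink socket.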
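For readers who expected a plain socket() call, the three libmnl calls above correspond roughly to the raw socket sequence sketched below. This is an illustration of the underlying system calls on Linux, not libmnl's actual code; `demo_netlink_open` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Open and bind a netlink socket for the given protocol.
 * Returns the file descriptor (>= 0) and stores the kernel-assigned
 * port ID in *portid, or returns -1 on failure.
 */
static int demo_netlink_open(int protocol, uint32_t *portid)
{
    struct sockaddr_nl addr;
    socklen_t len = sizeof(addr);
    int fd;

    assert(portid);

    /* Rough counterpart of mnl_socket_open(protocol). */
    fd = socket(AF_NETLINK, SOCK_RAW, protocol);
    if (fd < 0)
        return -1;

    /* Rough counterpart of mnl_socket_bind(h, 0, MNL_SOCKET_AUTOPID). */
    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_groups = 0;  /* no multicast groups */
    addr.nl_pid = 0;     /* 0 lets the kernel choose the port ID */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* Rough counterpart of mnl_socket_get_portid(h). */
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        goto fail;
    *portid = addr.nl_pid;
    return fd;

fail:
    close(fd);
    return -1;
}
```

Calling `demo_netlink_open(NETLINK_NETFILTER, &portid)` would request the same protocol that ipset_mnl_init passes to libmnl. Whether it succeeds depends on the running kernel having netfilter netlink support available.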
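4. Interacting with the kernel through netlink functions
In the same file, mnl.c, I found the ipset_mnl_query function, which calls mnl_socket_sendto and mnl_socket_recvfrom to communicate with the kernel.
The next step is to read the official libmnl API documentation.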
static int
ipset_mnl_query(struct ipset_handle *handle, void *buffer, size_t len)
{
struct nlmsghdr *nlh = buffer;
int ret;
assert(handle);
assert(buffer);
nlh->nlmsg_seq = ++handle->seq;
#ifdef IPSET_DEBUG
ipset_debug_msg("sent", nlh, nlh->nlmsg_len);
#endif
if (mnl_socket_sendto(handle->h, nlh, nlh->nlmsg_len) < 0)
return -ECOMM;
ret = mnl_socket_recvfrom(handle->h, buffer, len);
#ifdef IPSET_DEBUG
ipset_debug_msg("received", buffer, ret);
#endif
while (ret > 0) {
ret = mnl_cb_run2(buffer, ret,
handle->seq, handle->portid,
handle->cb_ctl[NLMSG_MIN_TYPE],
handle->data,
handle->cb_ctl, NLMSG_MIN_TYPE);
D("nfln_cb_run2, ret: %d, errno %d", ret, errno);
if (ret <= 0)
break;
ret = mnl_socket_recvfrom(handle->h, buffer, len);
D("message received, ret: %d", ret);
}
return ret;
}
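For the exact meaning of mnl_socket_recvfrom, mnl_socket_sendto, and mnl_cb_run2, consult the libmnl API documentation.
Communication between userspace and the kernel has to follow an agreed set of rules, which we can call the communication protocol.
The file ip_set.h defines the following commands: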
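To give a feel for what a callback runner has to do with a received buffer, here is a simplified sketch that walks the netlink messages in a buffer and checks each one against the expected sequence number and port ID. It uses only the standard NLMSG_* macros from linux/netlink.h. `demo_walk_messages` is a hypothetical helper, its checks are stricter than a real library needs to be, and mnl_cb_run2 does more (for example dispatching control messages through cb_ctl).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>

/*
 * Walk every netlink message in buf[0..len) and count those whose
 * sequence number and port ID match what the sender expects.
 * Stops at NLMSG_DONE. Returns the number of accepted messages,
 * or -1 as soon as a message fails the check.
 */
static int demo_walk_messages(void *buf, size_t len,
                              uint32_t seq, uint32_t portid)
{
    struct nlmsghdr *nlh = buf;
    int remaining = (int)len;
    int accepted = 0;

    assert(buf);
    while (NLMSG_OK(nlh, remaining)) {
        if (nlh->nlmsg_seq != seq || nlh->nlmsg_pid != portid)
            return -1;               /* not a reply to our request */
        if (nlh->nlmsg_type == NLMSG_DONE)
            break;                   /* end of a multipart reply */
        accepted++;
        nlh = NLMSG_NEXT(nlh, remaining);
    }
    return accepted;
}
```

This is why ipset_mnl_query bumps handle->seq before sending: the reply can then be matched to the request that caused it.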
/* Message types and commands */
enum ipset_cmd {
IPSET_CMD_NONE,
IPSET_CMD_PROTOCOL, /* 1: Return protocol version */
IPSET_CMD_CREATE, /* 2: Create a new (empty) set */
IPSET_CMD_DESTROY, /* 3: Destroy a (empty) set */
IPSET_CMD_FLUSH, /* 4: Remove all elements from a set */
IPSET_CMD_RENAME, /* 5: Rename a set */
IPSET_CMD_SWAP, /* 6: Swap two sets */
IPSET_CMD_LIST, /* 7: List sets */
IPSET_CMD_SAVE, /* 8: Save sets */
IPSET_CMD_ADD, /* 9: Add an element to a set */
IPSET_CMD_DEL, /* 10: Delete an element from a set */
IPSET_CMD_TEST, /* 11: Test an element in a set */
IPSET_CMD_HEADER, /* 12: Get set header data only */
IPSET_CMD_TYPE, /* 13: Get set type */
IPSET_CMD_GET_BYNAME, /* 14: Get set index by name */
IPSET_CMD_GET_BYINDEX, /* 15: Get set name by index */
IPSET_MSG_MAX, /* Netlink message commands */
/* Commands in userspace: */
IPSET_CMD_RESTORE = IPSET_MSG_MAX, /* 16: Enter restore mode */
IPSET_CMD_HELP, /* 17: Get help */
IPSET_CMD_VERSION, /* 18: Get program version */
IPSET_CMD_QUIT, /* 19: Quit from interactive mode */
IPSET_CMD_MAX,
IPSET_CMD_COMMIT = IPSET_CMD_MAX, /* 20: Commit buffered commands */
};
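These command numbers travel in the netlink message type. For nfnetlink subsystems the 16-bit type carries the subsystem ID in the high byte and the command in the low byte. The sketch below shows that packing with the constants copied in locally to stay self-contained; the `demo_*` and `DEMO_*` names are hypothetical, and the real definitions (NFNL_SUBSYS_IPSET and the enum above) live in the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies for illustration; see the kernel headers for the real ones. */
#define DEMO_NFNL_SUBSYS_IPSET 6   /* nfnetlink subsystem ID of ipset */
#define DEMO_IPSET_CMD_CREATE  2   /* matches the enum above */
#define DEMO_IPSET_MSG_MAX     16  /* first userspace-only command number */

/* Pack subsystem and command into a 16-bit netlink message type. */
static uint16_t demo_make_type(uint8_t subsys, uint8_t cmd)
{
    return (uint16_t)((subsys << 8) | cmd);
}

/* Extract the subsystem ID (high byte). */
static uint8_t demo_subsys(uint16_t type)
{
    return (uint8_t)(type >> 8);
}

/* Extract the command number (low byte). */
static uint8_t demo_cmd(uint16_t type)
{
    return (uint8_t)(type & 0xff);
}

/* Commands below IPSET_MSG_MAX are netlink messages; the rest stay in userspace. */
static int demo_is_netlink_cmd(unsigned int cmd)
{
    return cmd < DEMO_IPSET_MSG_MAX;
}
```

Under these assumptions a create request carries the type 0x0602, while commands such as help or version (17 and 18 in the enum) never leave userspace.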
Taking IPSET_CMD_CREATE as an example, search the kernel source (my kernel version is 3.10) for IPSET_CMD_CREATE.
The search turns up the following struct array:
static const struct nfnl_callback ip_set_netlink_subsys_cb[IPSET_MSG_MAX] = {
[IPSET_CMD_NONE] = {
.call = ip_set_none,
.attr_count = IPSET_ATTR_CMD_MAX,
},
[IPSET_CMD_CREATE] = {
.call = ip_set_create,
.attr_count = IPSET_ATTR_CMD_MAX,
.policy = ip_set_create_policy,
},
[IPSET_CMD_DESTROY] = {
.call = ip_set_destroy,
.attr_count = IPSET_ATTR_CMD_MAX,
.policy = ip_set_setname_policy,
},
[IPSET_CMD_FLUSH] = {
.call = ip_set_flush,
.attr_count = IPSET_ATTR_CMD_MAX,
.policy = ip_set_setname_policy,
},
/* remaining entries omitted */
};
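This shows that the handler for the IPSET_CMD_CREATE command is ip_set_create.
At this point, userspace sends the command to the kernel,
and the kernel responds to the userspace command.
The whole flow now works end to end.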
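The table-driven dispatch used by ip_set_netlink_subsys_cb can be illustrated with a small userspace sketch: an array of callbacks indexed by command number, filled in with designated initializers. Everything here is hypothetical (the `demo_*` names, the reduced command enum, and the handler signatures); the kernel's struct nfnl_callback has a different signature and also carries attribute policies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced command set for illustration only. */
enum demo_cmd {
    DEMO_CMD_NONE,
    DEMO_CMD_CREATE,
    DEMO_CMD_DESTROY,
    DEMO_CMD_FLUSH,
    DEMO_MSG_MAX,
};

struct demo_callback {
    int (*call)(const char *setname);  /* handler for one command */
};

static int demo_none(const char *setname)
{
    (void)setname;
    return -EOPNOTSUPP;  /* placeholder command, nothing to do */
}

static int demo_create(const char *setname)
{
    return setname ? 0 : -EINVAL;  /* pretend the set was created */
}

/* Entries not listed here are zero-initialized, so .call is NULL. */
static const struct demo_callback demo_cb[DEMO_MSG_MAX] = {
    [DEMO_CMD_NONE]   = { .call = demo_none },
    [DEMO_CMD_CREATE] = { .call = demo_create },
};

/* Look up the handler by command number and invoke it. */
static int demo_dispatch(unsigned int cmd, const char *setname)
{
    if (cmd >= DEMO_MSG_MAX || demo_cb[cmd].call == NULL)
        return -EINVAL;
    return demo_cb[cmd].call(setname);
}
```

The benefit of this layout is that adding a command only requires a new table entry; the dispatch code itself does not change.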