I'm using a netlink socket to communicate between user-space Python code and the Linux kernel. I can send a message from user space to the kernel, but I can't get the response back from the kernel. My kernel code logs: "Error while sending back to user".

I tried every receive method (recv, recvfrom, recvmsg). My goal is to unpack the response, which has the format response_message = header + payload, and extract the payload, but the user side never receives the response. This is the output when I run the Python code:

pid of sock : 1
seq number of sock : 0
pid of the message send to kernel:  1
seq number of the message send to kernel :  0
return of send  21
waiting for kernel ..

On the kernel side, the syslog shows:

Jul 14 03:57:32 Bob kernel: [12381.663131] Kernel received :hello  id:XXXX!
Jul 14 03:57:32 Bob kernel: [12381.663132] python pid :1 XXXX
Jul 14 03:57:32 Bob kernel: [12381.663133] seq newsock : 0 id:XXXX!
Jul 14 03:57:32 Bob kernel: [12381.663134] Error while sending back to user  id:XXXX!

I thought the problem might be a sequence number that does not match between the sender's and the receiver's messages, so I tried adding a flag (NLM_F_REQUEST) and using other message types (MSG_SETCFG = 11, MSG_GETCFG = 12), but it didn't work.

Can anyone familiar with netlink sockets help me figure out how to fix this?

Here is my user-space Python code:

import os
import socket
import struct
import ctypes


# types
NLMSG_NOOP = 1
NLMSG_ERROR = 2
NLMSG_DONE = 3
NLMSG_OVERRUN = 4
MSG_SETCFG = 11
MSG_GETCFG = 12
NLMSG_MIN_TYPE = 0x10

# flags
NLM_F_REQUEST = 1
NLM_F_MULTI = 2
NLM_F_ACK = 4
NLM_F_ECHO = 8



class Message:
    def __init__(self, msg_type, flags=0, seq=-1, payload=None):
        self.type = msg_type
        self.flags = flags
        self.seq = seq
        self.pid = 1
        payload = payload or []
        if isinstance(payload, list):
            contents = []
            for attr in payload:
                contents.append(attr._dump())
            self.payload = b''.join(contents)
        else:
            self.payload = payload

    def send(self, conn):
        if self.seq == -1:
            self.seq = conn.seq()

        self.pid = conn.pid
        length = len(self.payload)

        hdr = struct.pack("IHHII", length + 4 * 4, self.type,
                          self.flags, self.seq, self.pid)
        conn.send(hdr + bytes(self.payload, 'utf-8'))

 
class Connection(object):
    """
    Object representing Netlink socket connection to the kernel.
    """
    def __init__(self, nlservice=31, groups=0):
        # nlservice = Netlink IP service
        self.fd = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, nlservice)
        self.fd.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 65536)
        self.fd.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
        self.fd.bind((0, groups)) # pid=0 lets kernel assign socket PID
        self.pid, self.groups = self.fd.getsockname()
        self.pid = 1
        self._seq = 0
        
    def send(self, msg):
        if isinstance(msg, Message):
            if msg.seq == -1: 
                msg.seq = self.seq()
            #msg.seq = 1
            msg.pid = self.pid
            length = len(msg.payload)
            hdr = struct.pack("IHHII", length + 4 * 4, msg.type,
                          msg.flags, msg.seq, msg.pid) 
            msg = hdr + msg.payload.encode('utf-8')
            return self.fd.send(msg)
     
    def recve(self):
        #data, (nlpid, nlgrps) =  self.fd.recvfrom(16384)
        data = self.fd.recv(16384)
        msglen, msg_type, flags, seq, pid = struct.unpack("IHHII", data[:16])
        msg = Message(msg_type, flags, seq, data[16:])
        msg.pid = pid
        if msg_type == NLMSG_DONE:
           print("payload :", msg.payload)
           print("msg.pid :", msg.pid)
           print("msg.seq :", msg.seq)
        if msg.type == NLMSG_ERROR:
            errno = -struct.unpack("i", msg.payload[:4])[0]
            if errno != 0:
                err = OSError("Netlink error: %s (%d)" % (
                                                    os.strerror(errno), errno))
                err.errno = errno
                print("err :",err)
                raise err
        
        #return msg.payload 
        return msg
        
    def seq(self):
        self._seq += 1
        return self._seq


sock = Connection()

var = 'hello'
msg1 = Message(3,0,-1,var) 
print("pid of sock :", sock.pid)
print ("seq number of sock :",sock._seq)

#res1 = msg1.send(sock)
res1 = sock.send(msg1)
print("pid of the message send to kernel: ", msg1.pid)
print("seq number of the message send to kernel : ", msg1.seq)

print("return of send ", res1)

while 1:
   print("waiting for kernel ..")
   #msgreply = sock.fd.recvmsg(16384)
   res2 = sock.recve()
   print("return of recive ", res2)

And here is my kernel code:

#include <linux/module.h>
#include <net/mptcp.h>


/********Test deb***********/

#include <linux/module.h>  
#include <linux/kernel.h>  
#include <linux/init.h>  
#include <net/sock.h>  
#include <linux/socket.h>  
#include <linux/net.h>  
#include <asm/types.h>  
#include <linux/netlink.h>  
#include <linux/netlink.h>  
#include <linux/skbuff.h> 
#include <linux/inetdevice.h>
#include <linux/uio.h>
/****** other lib for other test github ******/
#include <linux/gfp.h>
#include <linux/kprobes.h>
#include <linux/ptrace.h>
#include <linux/time.h>
#include <net/net_namespace.h>


#define NETLINK_USER 31

#define MSG_SETCFG      0x11

#define MSG_GETCFG      0x12

//#define NETLINK_USERROCK 31

struct sock *nl_sk = NULL;


static void hello_nl_recv_msg(struct sk_buff *skb) {

struct nlmsghdr *nlh;
int pid, seq;
struct sk_buff *skb_out;
int msg_size;
char *msg="Helloo";
char *msgg;
int res;
char *recive;

//printk(KERN_DEBUG "KERNEL MODE id:XXXX!\n");
printk(KERN_INFO "Entering: %s  id:XXXX!\n", __FUNCTION__);
msg_size = strlen(msg);

////recive ////

nlh=(struct nlmsghdr*)skb->data;
//nlh = nlmsg_hdr(skb);
msgg=(char *)NLMSG_DATA(nlh);

printk(KERN_INFO "Kernel received :%s  id:XXXX!\n",msgg);

pid = nlh->nlmsg_pid; //pid of sending process /
//pid = 1; //just set it to 1 like we did in python
printk(KERN_INFO "python pid :%d XXXX", pid);

////////end recive /


///////sending message /

skb_out = nlmsg_new(msg_size, GFP_KERNEL);

if(!skb_out) {
    printk(KERN_ERR "Failed to allocate new skb  id:XXXX!\n");
    return;
}

//nlh = nlmsg_put(skb, 0, 1, NLMSG_DONE, msg_size + 1, 0);
nlh=nlmsg_put(skb_out,0,0,NLMSG_DONE,msg_size, 0);

NETLINK_CB(skb_out).dst_group = 0; //not in mcast group /
strncpy(nlmsg_data(nlh),msg,msg_size);

seq = nlh->nlmsg_seq;
printk(KERN_INFO "seq newsock : %d id:XXXX!\n", seq);

if(!nlmsg_data(nlh)) {
    printk(KERN_ERR "Failed to copy the message skbout id:XXXX!\n");
    return;
}
   
//printk(KERN_INFO "copied message :" , nlmsg_data(nlh));
//changed pid to 0
res=nlmsg_unicast(nl_sk,skb_out, pid);

if(res<0)
    printk(KERN_INFO "Error while sending back to user  id:XXXX!\n");

/////end send /

}


static int __init hello_init(void) {

printk("Entering: %s id:XXXX!\n",__FUNCTION__);
//This is for 3.6 kernels and above.
struct netlink_kernel_cfg cfg = {
    .input = hello_nl_recv_msg,
};

nl_sk = netlink_kernel_create(&init_net, NETLINK_USER, &cfg);
//nl_sk = netlink_kernel_create(&init_net, NETLINK_USER, 0, hello_nl_recv_msg,NULL,THIS_MODULE);
if(!nl_sk)
{
    printk(KERN_ALERT "Error creating socket.  id:XXXX!\n");
    return -10;

}

return 0;
}

static void __exit hello_exit(void) {

printk(KERN_INFO "exiting hello module  id:XXXX!\n");
netlink_kernel_release(nl_sk);
}

module_init(hello_init); module_exit(hello_exit);

MODULE_LICENSE("GPL");
Meri

2 Answers

0

Two things to note in the kernel module:

  1. What is the purpose of the line "if(!nlmsg_data(nlh))"? The payload has already been associated at that point.
  2. What is the return value of nlmsg_unicast? It would give us an idea of why it is returning an error.
tej parkash
  • Thank you for responding. I added the line "if(!nlmsg_data(nlh))" while debugging, to check that everything before nlmsg_unicast was correct and not causing an error, so I could be sure that the socket creation and the message to be sent to the user were working. I just forgot to delete it before posting my question. – Meri Jul 14 '21 at 10:15
  • Sure @Bob. What is the return value of nlmsg_unicast? That would help to understand this issue further. – tej parkash Jul 14 '21 at 10:37
  • I did some more research and found that the errno -111 returned by nlmsg_unicast() means "connection refused". This may be because the kernel responds to a message received from the user with two messages (an ACK and the actual response), and because our kernel does not send the response immediately, the user receives the ACK, parses it incorrectly as if it were the real response, and "closes the socket" (I'm not sure, and it doesn't seem to match my case, where I call recv() inside a while 1 loop). The kernel then doesn't find the user listening when it tries to send the actual response. – Meri Jul 14 '21 at 16:32
  • The solution proposed in https://stackoverflow.com/questions/34066680/failure-while-unicast-data-from-kernel-to-user-space-via-netlink is to make the client expect two responses: ignore the first one (the ACK) and receive the second one, which is the actual response. But I don't know exactly how to do this in my code. – Meri Jul 14 '21 at 16:37
  • Otherwise I'm thinking of disabling the ACK returned by the kernel, but I still haven't found the appropriate code for that. PS: this is just the first version of my code. In fact, I aim to send a message from user to kernel just to initiate the communication; then the kernel will repeatedly send requests and receive responses. – Meri Jul 14 '21 at 16:47
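
The "ignore the ACK, read the real response" idea from the comments above could be sketched roughly as follows. This is only an illustration under assumptions: the thread does not confirm that this kernel module actually emits an ACK, and the helper names (iter_messages, is_ack, first_reply) are hypothetical. It relies on two netlink conventions: messages in one buffer are aligned to 4 bytes, and an ACK is an NLMSG_ERROR message whose error code is 0.

```python
import struct

NLMSG_ERROR = 2
# Netlink header: length, type, flags, sequence number, port ID (16 bytes).
NLMSG_HDR = struct.Struct("=IHHII")


def nlmsg_align(n):
    """Round a message length up to the 4-byte netlink alignment."""
    return (n + 3) & ~3


def iter_messages(buf):
    """Yield (type, flags, seq, pid, payload) for each message in buf."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(buf):
        length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(buf, offset)
        # Stop on a truncated or malformed message instead of misparsing it.
        if length < NLMSG_HDR.size or offset + length > len(buf):
            break
        payload = buf[offset + NLMSG_HDR.size:offset + length]
        yield msg_type, flags, seq, pid, payload
        offset += nlmsg_align(length)


def is_ack(msg_type, payload):
    """An ACK is an NLMSG_ERROR message carrying error code 0."""
    return (msg_type == NLMSG_ERROR and len(payload) >= 4
            and struct.unpack_from("=i", payload)[0] == 0)


def first_reply(buf):
    """Return (type, payload) of the first non-ACK message, or None."""
    for msg_type, _flags, _seq, _pid, payload in iter_messages(buf):
        if not is_ack(msg_type, payload):
            return msg_type, payload
    return None
```

In the receive loop from the question, this would be applied to the bytes returned by self.fd.recv(16384); if first_reply() returns None, the loop simply calls recv() again.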
0

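I tried the Python code with my own kernel module using netlink, and after changing self.pid = 1 to self.pid = os.getpid() I was able to get a response from the kernel module.

With it set to 1, I had the same issue: the kernel module reported error -111. This is consistent with how nlmsg_unicast works: it delivers to the netlink socket whose port ID matches the pid it is given, and a socket bound with pid 0 is normally assigned the process ID as its port ID, so there is no socket with port ID 1 to deliver to.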
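
A minimal sketch of that change, assuming the bind-to-0 setup from the question. Instead of hard-coding the port ID, the header carries either os.getpid() or the value the kernel reports through getsockname(). The helper names (pack_nlmsg, unpack_nlmsg, open_netlink) are hypothetical and only for illustration; open_netlink also assumes the kernel module for protocol 31 is loaded.

```python
import os
import socket
import struct

# Netlink header: length, type, flags, sequence number, port ID (16 bytes).
NLMSG_HDR = struct.Struct("=IHHII")


def pack_nlmsg(msg_type, flags, seq, pid, payload):
    """Build one netlink message: 16-byte header followed by the payload."""
    if isinstance(payload, str):
        payload = payload.encode("utf-8")
    total = NLMSG_HDR.size + len(payload)
    return NLMSG_HDR.pack(total, msg_type, flags, seq, pid) + payload


def unpack_nlmsg(data):
    """Split one netlink message into its header fields and payload."""
    length, msg_type, flags, seq, pid = NLMSG_HDR.unpack_from(data)
    return msg_type, flags, seq, pid, data[NLMSG_HDR.size:length]


def open_netlink(nlservice=31, groups=0):
    """Open the socket and return it with the port ID the kernel assigned.

    Not called here: it needs Linux and a kernel module that registered
    protocol number nlservice.
    """
    fd = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, nlservice)
    fd.bind((0, groups))  # pid=0 lets the kernel pick the port ID
    pid, _groups = fd.getsockname()
    return fd, pid


# Same message as in the question, but with the real process ID in the header.
raw = pack_nlmsg(3, 0, 1, os.getpid(), "hello")
```

The kernel side reads nlh->nlmsg_pid from this header and passes it to nlmsg_unicast, so the value has to identify a socket that really exists.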

Peter Csala