
I'm working on a project in C whose goal is to create multiple threads and work with them. The program works fine on macOS, but when I run the same code on a Kali VM or on WSL (Windows Subsystem for Linux), it gives me the following error.

The error on the Kali VM

└─$ ./a.out 3 800 200 200 1200
malloc(): corrupted top size
zsh: abort      ./a.out 3 800 200 200 1200

The error on WSL

└─$ ./a.out 2 60 60 20 
malloc(): corrupted top size
Aborted (core dumped)

You can check the full code in this repo.

This is the main file of the code:

#include "philosophers.h"

int ft_error_put(char *messsage, int ret)
{
    printf("%s\n", messsage);
    return (ret);
}

int ft_parsing(char **av, t_simulation *simulation)
{
    int             num;
    int             i;
    int             j;

    i = 1;
    j = 0;
    while (av[i])
    {
        j = 0;
        num = 0;
        while (av[i][j])
        {
            if (av[i][j] >= '0' && av[i][j] <= '9')
                 num = num * 10 + (av[i][j] - '0');
            else
                return (ft_error_put("Error: Number Only", 1));
            j++;
        }
        if (i == 1)
        {
            simulation->philo_numbers = num;
            simulation->forks = num;
            simulation->threads = (pthread_t *)malloc(sizeof(pthread_t) * num);
        }
        else if (i == 2)
            simulation->time_to_die = num;
        else if (i == 3)
            simulation->time_to_eat = num;
        else if (i == 4)
            simulation->time_to_sleep = num;
        else if (i == 5)
            simulation->eat_counter = num;
        i++;
    }
    if (i == 5)
        simulation->eat_counter = -1;
    return (0);
}

void    ft_for_each_philo(t_simulation *simulation, t_philo *philo, int i)
{
    philo[i].index = i + 1;
    philo[i].left_hand = i;
    philo[i].right_hand = (i + 1) % simulation->philo_numbers;
    philo[i].is_dead = NO;
    if (simulation->eat_counter == -1)
        philo[i].eat_counter = -1;
    else
        philo[i].eat_counter = simulation->eat_counter;
}

t_philo *ft_philo_init(t_simulation *simulation)
{
    t_philo *philo;
    int     i;

    i = -1;
    philo = (t_philo *)malloc(sizeof(t_philo));
    while (++i < simulation->philo_numbers)
        ft_for_each_philo(simulation, philo, i);
    return (philo);
}

void    *ft_routine(void *arg)
{
    t_philo *philo;

    philo = (t_philo *)arg;
    printf("thread number %d has started\n", philo->index);
    sleep(1);
    printf("thread number %d has ended\n", philo->index);
    return (NULL);
}

int main(int ac, char **av)
{
    int             i;
    t_simulation    simulation;
    t_philo         *philo;

    i = 0;
    if (ac == 5 || ac == 6)
    {
        if (ft_parsing(av, &simulation))
            return (1);
        philo = ft_philo_init(&simulation);
        while (i < simulation.philo_numbers)
        {
            simulation.philo_index = i;
            pthread_create(simulation.threads + i, NULL,
                ft_routine, philo + i);
            i++;
        }
        i = 0;
        while (i < simulation.philo_numbers)
        {
            pthread_join(simulation.threads[i], NULL);
            i++;
        }
    }
    return (0);
}

DarkSide77
  • `while (av[i])` in `ft_parsing` looks dangerous to me. I don't think `argv` (in your case, `av`) is guaranteed to be NULL-terminated. `argc` (in your case, `ac`) is there to tell you how many arguments there are; use it. There are also [existing functions](https://man7.org/linux/man-pages/man3/strtol.3.html) that convert strings into numbers; you should use those unless you have a specific reason for avoiding them. – yano Jul 06 '21 at 19:22
  • Yes, I can only use some functions, so that's why I didn't use `atoi`, for example, to convert a string to a number ... – DarkSide77 Jul 06 '21 at 19:30
  • I'm wrong, learn something new every day: _Further, the array element at `argv[argc]` is a null pointer, so the array itself is also, in a sense, "null terminated"_: https://stackoverflow.com/a/11020198/3476780 – yano Jul 06 '21 at 19:30
  • I didn't know `argv[argc]` was guaranteed to be a NULL pointer. `while(av[i])` keeps looping until that condition is false. If `argv[argc]` _wasn't_ NULL, your loop would keep on looping into memory you didn't own, and who knows what would happen. But that's all a moot point now; you're fine using `while(av[i])`, the problem is elsewhere. – yano Jul 06 '21 at 19:35
  • Yeah, because I use this a lot. So the problem is not in the code, because it works fine on macOS. The problem is in the VM configuration, I think ... – DarkSide77 Jul 06 '21 at 19:38
  • The symptom you describe ("works fine here, not here") is a telltale sign of [undefined behavior](https://en.wikipedia.org/wiki/Undefined_behavior), and the problem is most certainly in your code, not the system or compiler. There are a [multitude of things](https://www.wikiod.com/w/C_Undefined_behavior) that can invoke UB; you'll simply need to examine your code to find it. Someone will find it, but I'm out of time at the moment. This is why posting an [mre] vs a code dump is preferred. – yano Jul 06 '21 at 19:55
  • Also strongly recommend running it with a debugger attached. It will stop at the line it crashes at, giving you the opportunity to examine variables and memory, shedding light on how the _real_ behavior of your code deviates from what you _think_ it is doing. – yano Jul 06 '21 at 19:58

1 Answer


Your program was aborting on a pthread_create call.

But the cause was an earlier malloc call in ft_philo_init that requested too little memory.

You were only allocating enough space for one t_philo struct instead of philo_numbers of them, so the initialization loop wrote past the end of the allocation.

Side note: Don't cast the return value of malloc. See: Do I cast the result of malloc?

Here is the corrected function:

t_philo *
ft_philo_init(t_simulation *simulation)
{
    t_philo *philo;
    int i;

    i = -1;

// NOTE/BUG: not enough elements allocated
#if 0
    philo = (t_philo *) malloc(sizeof(t_philo));
#else
    philo = malloc(sizeof(*philo) * simulation->philo_numbers);
#endif

    while (++i < simulation->philo_numbers)
        ft_for_each_philo(simulation, philo, i);

    return philo;
}
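As a minimal sketch of the same fix with the allocation also checked for failure: the struct below is a simplified stand-in for t_philo (the real one lives in philosophers.h), and the names t_demo_philo and demo_philo_init are hypothetical, used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for t_philo (assumption, not the repo's struct). */
typedef struct s_demo_philo
{
    int index;
    int left_hand;
    int right_hand;
}   t_demo_philo;

/*
** Allocate and initialise `count` philosophers.
** Returns NULL if count is not positive or if malloc fails.
*/
t_demo_philo *demo_philo_init(int count)
{
    t_demo_philo *philo;
    int          i;

    if (count <= 0)
        return (NULL);
    /* One element per philosopher, not a single element. */
    philo = malloc(sizeof(*philo) * (size_t)count);
    if (philo == NULL)
        return (NULL);
    i = 0;
    while (i < count)
    {
        philo[i].index = i + 1;
        philo[i].left_hand = i;
        philo[i].right_hand = (i + 1) % count;
        i++;
    }
    return (philo);
}
```

The caller would check the result for NULL before creating threads and free() the array after joining them. Using sizeof(*philo) rather than sizeof(t_demo_philo) keeps the size expression correct even if the pointer's type changes later.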

UPDATE:

Thank you, it works now, but can you explain why it works on macOS but not on Kali? – DarkSide77

Well, it did not "work" on macOS either ...

When we index through an array and go beyond the bounds of the array, it is UB ("undefined behavior"). UB means just that: undefined behavior.

See:

  1. Undefined, unspecified and implementation-defined behavior
  2. Is accessing a global array outside its bound undefined behavior?

Anything could happen. That's because the philo array occupies a certain amount of memory. What is placed after that allocation? Let's assume philo is 8 bytes [or elements if you wish--it doesn't matter]:

| philo[8]                      | whatever                      |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |

As long as we stay within bounds, things are fine (e.g.):

for (i = 0;  i < 8;  ++i)
    philo[i] = 23;

If we go beyond the end, we have UB (e.g.):

for (i = 0;  i < 9;  ++i)
    philo[i] = 23;

Here we went one beyond and modified the first cell of whatever.

Depending upon what variable was placed there by the linker, several behaviors are possible:

  1. The program seems to run normally.
  2. A value at whatever is corrupted and the program runs but produces incorrect results.
  3. whatever is aligned to a page that is write protected. The program will segfault on a protection exception immediately.
  4. The value corrupted at whatever has no immediate effect, but later the program detects the corruption.
  5. The corruption eventually causes a segfault because a pointer value was corrupted.

On both systems, your program was doing the same thing. For an area we get from malloc, the whatever is an internal struct used by the heap manager to keep track of the allocations. The program was corrupting this.

On macOS, the heap manager did not detect this. On Linux (glibc), the heap manager did more cross-checking and detected the corruption.
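To make "detecting the corruption later" concrete, here is a toy sketch of one way a checker can notice an overrun: place known guard bytes right after the usable area and verify them afterwards. This is only an illustration of the idea, not how glibc or the macOS allocator is actually implemented; guarded_alloc, guard_intact, GUARD_SIZE and GUARD_BYTE are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 8    /* number of guard bytes after the usable area (arbitrary) */
#define GUARD_BYTE 0xA5 /* known pattern written into the guard (arbitrary) */

/* Allocate `size` usable bytes followed by GUARD_SIZE guard bytes. */
unsigned char *guarded_alloc(size_t size)
{
    unsigned char *buf;

    buf = malloc(size + GUARD_SIZE);
    if (buf == NULL)
        return (NULL);
    memset(buf + size, GUARD_BYTE, GUARD_SIZE);
    return (buf);
}

/* Return 1 if the guard is intact, 0 if something wrote past `size`. */
int guard_intact(const unsigned char *buf, size_t size)
{
    size_t i;

    i = 0;
    while (i < GUARD_SIZE)
    {
        if (buf[size + i] != GUARD_BYTE)
            return (0);
        i++;
    }
    return (1);
}
```

Writing to buf[0] through buf[size - 1] leaves guard_intact() returning 1; a write to buf[size] flips it to 0. In this toy the overrun still lands inside the real allocation, so the demonstration itself stays well defined. An allocator that performs no comparable check simply never notices, which matches the "seems to run normally" outcome.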

Craig Estey