1

I am trying out some array manipulation: sorting a char array and removing duplicates from it. Your comments are welcome. I haven't done much testing or error handling yet.

#include<stdafx.h>
#include<stdlib.h>
#include<stdio.h>
#include<string>
using namespace std;

void sort(char *& arr)
{
    char temp;
    for(int i=0;i<strlen(arr);i++)
    {
        for(int j=i+1;j<strlen(arr);j++)
        {
            if(arr[i] > arr[j])
            {
            temp = arr[i];
            arr[i] = arr[j];
            arr[j] = temp;
            }
        }
    }

}
bool ispresent(char *uniqueArr, char * arr)
{
    bool isfound = false;
    for(int i=0;i<strlen(arr);i++)
    {
    for(int j=0;j<=strlen(uniqueArr);j++)
    {
        if(arr[i]== uniqueArr[j])
        {
        isfound = true;
        return isfound;
        }
        else
        isfound = false;
    }
    }

    return isfound;
}

char * removeduplicates(char *&arr)
{
    char * uniqqueArr = strdup(""); // To make this char array modifiable
    int index = 0;
    bool dup = false;
    while(*arr!=NULL)
    {       
     dup = ispresent(uniqqueArr, arr);
     if(dup == true)
     {}//do nothing
     else// copy the char to new char array.
     {
           uniqqueArr[index] = *arr;    
     index++;
     }
    arr++;
    }
    return uniqqueArr;
}
int main()
{
    char *arr = strdup("saaangeetha"); 
    // if strdup() is not used , access violation writing to 
          //location occurs at arr[i] = arr[j]. 
    //This makes the constant string modifiable
    sort(arr);
    char * uniqueArr = removeduplicates(arr);   

}
user457660
  • 1
    Why why why `char*`? Why why why not `std::string` or `std::vector`? – Nawaz Mar 22 '11 at 19:04
  • @user457660: Read my comment which I wrote in response to yours to my solution! – Nawaz Mar 22 '11 at 20:19
  • @user457660: I created another topic discussing few things, inspired from this topic. Here is the link : http://stackoverflow.com/questions/5397616/what-is-wrong-with-stdset – Nawaz Mar 22 '11 at 21:23

3 Answers

7

If you use std::string, your code (which is really C-style) can be written in C++ style in just a few lines:

#include <iostream>
#include <string>
#include <algorithm>

int main() {
        std::string s= "saaangeetha";
        std::sort(s.begin(), s.end());
        std::string::iterator it = std::unique (s.begin(), s.end()); 
        s.resize( it - s.begin());
        std::cout << s ;
        return 0;
}

Output: (all duplicates removed)

aeghnst

Demo : http://ideone.com/pHpPh

If you want char* at the end, then you can do this:

   const char *uniqueChars = s.c_str(); //after removing the duplicates!
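
One caveat worth keeping in mind: the pointer returned by c_str() stays valid only while s is alive and unmodified. If the characters need to outlive the string, they have to be copied. A minimal sketch of that idea follows; the helper names sortedUnique and toOwnedBuffer are made up for illustration and are not part of the answer above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts the characters and drops adjacent duplicates (hypothetical helper).
std::string sortedUnique(std::string s)
{
    std::sort(s.begin(), s.end());
    s.erase(std::unique(s.begin(), s.end()), s.end());
    return s;
}

// Copies the characters, including the terminating '\0', into a buffer
// that the caller owns (hypothetical helper).
std::vector<char> toOwnedBuffer(const std::string &s)
{
    return std::vector<char>(s.c_str(), s.c_str() + s.size() + 1);
}
```

With these, toOwnedBuffer(sortedUnique("saaangeetha")) yields a buffer whose first element can be passed wherever a char * is expected.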
Nawaz
  • 1
    +1 for `std::string`. @user If you are using c++, why not use the STL? it is there to make it easier and to not have to reinvent the wheel. (this is of course, unless you have a char * requirement) – Sagar Mar 22 '11 at 19:11
  • 1
    @Nawaz, I would also do the same way as you suggested in my work. But I am trying to learn the concepts here and so wanted to get hands on with char *, sort and duplicates myself. – user457660 Mar 22 '11 at 20:08
  • 1
    @user457660: Alright. In that case, I think `char * uniqqueArr = strdup("")` is not correct, as it doesn't allocate as much memory as you need. Better do: `char * uniqqueArr = new char[strlen(arr) + 1];` (the extra byte is for the terminating '\0'). That way you can write `uniqqueArr[index]` where `index < strlen(arr)`. – Nawaz Mar 22 '11 at 20:17
  • @user457660: I created another topic discussing few things, inspired from this topic. Here is the link : http://stackoverflow.com/questions/5397616/what-is-wrong-with-stdset – Nawaz Mar 22 '11 at 21:22
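
For the hand-rolled char * route discussed in the comments above, a minimal sketch might look like the following. It assumes a result buffer sized strlen(arr) + 1 so there is room for the terminator, and it takes the length once instead of calling strlen in every loop test. The names sortChars and removeDuplicates are illustrative, not the asker's originals.

```cpp
#include <cstddef>
#include <cstring>

// Sorts a NUL-terminated string in place with a simple exchange sort.
void sortChars(char *arr)
{
    const std::size_t n = std::strlen(arr);   // measure once, not per iteration
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j)
            if (arr[i] > arr[j]) {
                const char temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
            }
}

// Returns a newly allocated string holding each character of arr once,
// in order of first appearance. The caller must delete[] the result.
char *removeDuplicates(const char *arr)
{
    const std::size_t n = std::strlen(arr);
    char *unique = new char[n + 1];           // +1 for the terminating '\0'
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        bool seen = false;
        for (std::size_t j = 0; j < count; ++j)
            if (unique[j] == arr[i]) { seen = true; break; }
        if (!seen)
            unique[count++] = arr[i];
    }
    unique[count] = '\0';                     // terminate the result
    return unique;
}
```

Starting from a writable array such as char arr[] = "saaangeetha";, calling sortChars(arr) and then removeDuplicates(arr) produces "aeghnst", matching the std::string version.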
2

If I were doing it, I think I'd do the job quite a bit differently. If you can afford to ignore IBM mainframes, I'd do something like this:

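unsigned long bitset = 0;

const char *arr = "saaangeetha";   /* string literals are read-only */
const char *pos;

for (pos = arr; *pos; ++pos)
    if (isalpha((unsigned char)*pos))   /* needs <ctype.h>; the cast avoids UB for negative chars */
        bitset |= 1UL << (tolower((unsigned char)*pos) - 'a');   /* 1UL matches the width of bitset */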

This associates one bit in bitset with each possible letter. It then walks through the string and for each letter in the string, sets the associated bit in bitset. To print out the letters once you're done, you'd walk through bitset and print out the associated letter if that bit was set.
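
The print-out step described above could be sketched roughly as follows. This assumes a character set with contiguous lowercase letters (such as ASCII) and the default "C" locale; letterMask and lettersFromMask are hypothetical names chosen for this illustration.

```cpp
#include <cctype>
#include <string>

// Builds the mask: bit k is set when the letter 'a' + k occurs in text.
// Assumes contiguous lowercase letters and the default "C" locale.
unsigned long letterMask(const char *text)
{
    unsigned long mask = 0;
    for (const char *pos = text; *pos; ++pos) {
        const unsigned char c = static_cast<unsigned char>(*pos);
        if (std::isalpha(c))
            mask |= 1UL << (std::tolower(c) - 'a');
    }
    return mask;
}

// Walks the 26 bits in order and collects the letters whose bit is set.
std::string lettersFromMask(unsigned long mask)
{
    std::string out;
    for (int k = 0; k < 26; ++k)
        if (mask & (1UL << k))
            out += static_cast<char>('a' + k);
    return out;
}
```

Because the bits are visited from 'a' to 'z', the letters come out already sorted, so no separate sorting pass is needed.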

If you do care about IBM mainframes, you can add a small lookup table:

static char const *letters = "abcdefghijklmnopqrstuvwxyz";

and use strchr to find the correct position for each letter.
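
A rough sketch of the lookup-table variant: the bit index comes from the letter's position in the table rather than from arithmetic on character codes, so gaps in the character set's alphabet no longer matter. The function name letterMaskPortable is an assumption made for this example.

```cpp
#include <cctype>
#include <cstring>

// Sets bit k when the k-th letter of the table occurs in text.
// The index is found with strchr, not by subtracting 'a'.
unsigned long letterMaskPortable(const char *text)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    unsigned long mask = 0;
    for (const char *pos = text; *pos; ++pos) {
        const int lower = std::tolower(static_cast<unsigned char>(*pos));
        const char *hit = std::strchr(letters, lower);
        if (hit != 0)                          // not a letter: skip it
            mask |= 1UL << (hit - letters);
    }
    return mask;
}
```

The trade-off is a linear search of up to 26 characters per input character, in exchange for not depending on the letters being contiguous.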

Edit: If you're using C++ rather than C (as the tag said when I wrote what's above), you can simplify the code a bit at the expense of using some extra storage (and probably being minutely slower):

std::string arr = "saaangeetha";

std::set<char> letters((arr.begin()), arr.end());

std::copy(letters.begin(), letters.end(), std::ostream_iterator<char>(std::cout, " "));

Note, however, that while these appear the same for the test input, they can behave differently -- the previous version screens out anything but letters (and converts them all to lower case), but this distinguishes upper from lower case, and shows all non-alphabetic characters in the output as well.
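
To make that difference concrete, here is a small sketch with two hypothetical helpers: distinctChars keeps every character the way the std::set version does, while distinctLetters filters and lower-cases first, like the bitset version. The example results assume an ASCII-compatible character set.

```cpp
#include <cctype>
#include <set>
#include <string>

// Every distinct character, in the sorted order std::set<char> gives them.
std::string distinctChars(const std::string &text)
{
    std::set<char> seen(text.begin(), text.end());
    return std::string(seen.begin(), seen.end());
}

// Only letters, folded to lower case before being recorded.
std::string distinctLetters(const std::string &text)
{
    std::set<char> seen;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalpha(c))
            seen.insert(static_cast<char>(std::tolower(c)));
    }
    return std::string(seen.begin(), seen.end());
}
```

For the input "Aa a!", distinctChars returns " !Aa" (space, exclamation mark, and both cases of the letter), whereas distinctLetters returns just "a".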

Jerry Coffin
  • "If you can afford to ignore IBM mainframes" and certain modern DSPs. – Steve Jessop Mar 22 '11 at 19:51
  • @Steve: what DSP would use a character set in which letters weren't contiguous? – Jerry Coffin Mar 22 '11 at 19:52
  • sorry, I'm stupid, I completely misunderstood what you were talking about. I thought there was an assumption in there about byte size too, but now you mention it I'm pretty sure there are still 26 letters in the alphabet regardless. – Steve Jessop Mar 22 '11 at 22:54
  • @Steve Jessop: Yes -- the problem on IBM mainframes is that in EBCDIC there are other characters stuck in the middle of the alphabet (well, not exactly the middle, but at two spots, ~1/3rd and 2/3rds of the way through) which can throw off indexing and such. – Jerry Coffin Mar 22 '11 at 22:57
1
char *arr = "saaangeetha";

Here arr points to the read-only section of memory where the string literal saaangeetha is stored. The literal cannot be modified, which is the reason for the access violation. Instead, you need to do:

char arr[] = "saaangeetha"; // Now arr is a modifiable array holding a copy of the string literal.
Mahesh