1

I have a data frame with columns, say v1~v4

| _NAME    | _TIMESTAMP          | v0    | v1   | v2    | v3    | v4    |
|----------|---------------------|-------|------|-------|-------|-------|
| BRAKE_LH | 17-11-2021 22:50:43 | 13896 | 8262 | 12339 | 13110 | 13107 |
| BRAKE_LH | 17-11-2021 22:51:34 | 13896 | 8262 | 12339 | 13110 | 13107 |
| BRAKE_LH | 17-11-2021 22:51:35 | 13896 | 8262 | 12339 | 13110 | 13107 |
| BRAKE_LH | 17-11-2021 22:51:36 | 13896 | 8262 | 12339 | 13110 | 13107 |
| BRAKE_LH | 17-11-2021 22:51:37 | 0     | 0    | 0     | 0     | 0     |  

If I want to do the below function to the columns v1~v4

df['v0'] = df['v0'].apply(lambda x: chr(round(x / 256)) + chr(x % 256)).apply(lambda x: x[::-1])
df['v1'] = df['v1'].apply(lambda x: chr(round(x / 256)) + chr(x % 256)).apply(lambda x: x[::-1])
df['v2'] = df['v2'].apply(lambda x: chr(round(x / 256)) + chr(x % 256)).apply(lambda x: x[::-1])
df['v3'] = df['v3'].apply(lambda x: chr(round(x / 256)) + chr(x % 256)).apply(lambda x: x[::-1])
df['v4'] = df['v4'].apply(lambda x: chr(round(x / 256)) + chr(x % 256)).apply(lambda x: x[::-1])

In come cases the columns goes beyond 4 columns, say 40 or 100 columns

Is there a simple way to apply it for all columns, except--> _NAME & _TIMESTAMP columns

user_v27
  • 423
  • 4
  • 12
  • Provide your input data as code. – hpchavaz Nov 18 '21 at 17:37
  • @hpchavaz: modified – user_v27 Nov 18 '21 at 17:51
  • Does this answer your question? [How to exclude and filter few columns in pandas?](https://stackoverflow.com/questions/56009504/how-to-exclude-and-filter-few-columns-in-pandas) – Алексей Р Nov 18 '21 at 19:05
  • you could do: for col in df.columns: df[col] = df[col].apply( ...) tip: it would be cleaner if you define a function and call it in your lambda expression – Oliver Prislan Nov 18 '21 at 19:11
  • As a side note: seen your formula has a part of dividing by 256 and the other part taking the remainder of division by using %. This is most commonly used with the first part using integer division, i.e `chr(x // 256)` instead of rounding the result of `x / 256`. See whether this is applicable to your use case. The problem of rounding is you would get a value of 1 bigger when the division result has a fractional part >= 0.5 It would not be that meaningful if you also take the remainder together with this rounded value. – SeaBean Nov 19 '21 at 08:39

2 Answers2

2

You can set columns _NAME and _TIMESTAMP as index (to exclude them for processing) by .set_index(). Then use .applymap() to use your formulas for processing elementwise on each column. Finally, restore the columns _NAME and _TIMESTAMP to data columns by .reset_index(), as follows:

df.set_index(['_NAME', '_TIMESTAMP']).applymap(lambda x: chr(round(x / 256)) + chr(x % 256)).applymap(lambda x: x[::-1]).reset_index()

Result:

      _NAME           _TIMESTAMP  v0  v1  v2  v3  v4
0  BRAKE_LH  17-11-2021 22:50:43  H6  F   30  63  33
1  BRAKE_LH  17-11-2021 22:51:34  H6  F   30  63  33
2  BRAKE_LH  17-11-2021 22:51:35  H6  F   30  63  33
3  BRAKE_LH  17-11-2021 22:51:36  H6  F   30  63  33
4  BRAKE_LH  17-11-2021 22:51:37  
SeaBean
  • 22,547
  • 3
  • 13
  • 25
  • I get this error. raise TypeError(f"cannot convert the series to {converter}") TypeError: cannot convert the series to – user_v27 Nov 19 '21 at 05:02
  • I think its because of one particular column in my other data frame which had -ve values, so only when i had to delete the column and tried it worked – user_v27 Nov 19 '21 at 06:01
  • @user_v27 That's right, your formula with the call to `chr()` should accept only positive integers. However, the error `TypeError: cannot convert the series to ` should occur only when you use `.apply()` . Using `.applymap()` as I suggested should not give this error since we are processing element by element instead of processing a Pandas Series at a time. – SeaBean Nov 19 '21 at 06:12
0

You can put the columns you want to ignore in a set, IGNORELIST = {'_NAME', '_TIMESTAMP'}.

Then iterate through the column names and check whether a name is ignored or not. If it's not, apply your functions.

here's an example

# df = ..your dataframe..

IGNORELIST = {'colname1', 'colname2'}

for colname in df.columns:
  if not colname in IGNORELIST:
    df[colname] = df[colname].apply(lambda x: chr(round(x / 256)) + chr(x % 256)).apply(lambda x: x[::-1])
Queuebee
  • 651
  • 1
  • 6
  • 24
  • I dont want to create new columns, the columns are already available in data frame. I want to apply the function to all columns in a given data frame except in the ignore list (_NAME, _TIMESTAMP) – user_v27 Nov 18 '21 at 18:31
  • It was only an example, I assumed you had your own dataframe since you used `df` in your code. I'll remove the example dataframe. – Queuebee Nov 18 '21 at 21:27