I am trying to achieve the simplest task - creating the FISCAL_YEAR column as shown below (column YEAR is given):

+------+-------------+
| YEAR | FISCAL_YEAR |
+------+-------------+
| 2022 | 2022-2023   |
+------+-------------+
| 2022 | 2022-2023   |
+------+-------------+
| 2022 | 2022-2023   |
+------+-------------+
| 2022 | 2022-2023   |
+------+-------------+

I keep getting the error: can only concatenate str (not "int") to str

These are the steps I've tried so far, without success:

  1. df['fiscal_year'] = str(df['year']) + "-" + str(df['year']+1)

  2. df['fiscal_year'] = df['year'].astype(str) + "-" + (df['year']+1).astype(str)

  3. df['year_str'] = pd.Series(df['year'], dtype=pd.StringDtype())

And also:

df['year_str'] = df['year'].astype(str)

And then:

df['year_str'].str.cat(df['year_str'].astype(int) + 1, sep='-')

None of these options work. Is there anything else I'm missing?

** I am on Windows 10 and Python version 3.9.7

wjandrea
SAR
  • `str(df['year'])` obviously won't work since that's the str of the Series, not a Series of strs. Though maybe you already know that. – wjandrea Apr 06 '23 at 15:05
  • What dtype is `YEAR`? It looks like it's already str, so `df['year']+1` is failing. – wjandrea Apr 06 '23 at 15:08
  • `.str.cat(df['year_str'].astype(int) + 1, ...)` also obviously won't work since you're passing in ints. – wjandrea Apr 06 '23 at 15:08
  • @wjandrea, actually `df['year']+1` in itself works fine. Might it be doing implicit casting? The issue was with the concatenation. The solution from the person below gets it to work, though. Thanks! – SAR Apr 06 '23 at 15:11
  • Implicit casting? in Python? no, never. Python's strongly-typed and Pandas follows that. So if `year` is dtype int, I can't reproduce the issue. Option 2 works perfectly. Make a [mre]; the issue might not be where you expect. Like, you might have made a typo somewhere and don't need to do three casts in the end. See also: [How to make good reproducible pandas examples](/q/20109391/4518341) and [Why should I post complete errors?](https://meta.stackoverflow.com/q/359146/4518341) – wjandrea Apr 06 '23 at 15:16
  • As a few comments have hinted, the error is probably because the `YEAR` column is unexpectedly of type `str`. If you wrote a [minimal, complete, verifiable example](https://stackoverflow.com/help/minimal-reproducible-example), i.e., including code to construct a minimal, problematic version of `df`, you probably would have discovered this for yourself. Often writing an MVCE for StackOverflow (or a colleague) solves the problem without needing to post the question. – Matthias Fripp Apr 06 '23 at 15:20
  • @Matthias FYI, you can write `[mre]` in a comment and it expands to *[mre]*. More shorthands listed here: [comment formatting help](/editing-help#comment-formatting) – wjandrea Apr 06 '23 at 15:21
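
A minimal sketch of the dtype check the comments above ask for. It assumes a hypothetical four-row `df` whose `year` column holds strings; the question never shows how `df` was built, so this is a guess at the cause, not a confirmed diagnosis.

```python
import pandas as pd

# Hypothetical data: years stored as strings, as the comments suspect.
# dtype=object is set explicitly so the column holds plain Python strings.
df = pd.DataFrame({"year": pd.Series(["2022"] * 4, dtype=object)})

# An object dtype means the column holds Python objects (here, strings),
# not integers.
print(df["year"].dtype)

# Adding an int to a column of strings raises TypeError.
try:
    df["year"] + 1
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# str() on a Series returns one string describing the whole Series,
# while .astype(str) returns a Series of per-row strings.
whole_series_repr = str(df["year"])
per_row_strings = df["year"].astype(str)
print(type(whole_series_repr).__name__)
print(per_row_strings.tolist())
```

If `df['year'].dtype` turns out to be an integer type instead, the string-column explanation does not apply and the problem is likely elsewhere in the code.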

1 Answer

A failproof way could be:

df['FISCAL_YEAR'] = (df['YEAR'].astype(str)
                     +'-'+
                     df['YEAR'].astype(int).add(1).astype(str)
                    )

Output:

   YEAR FISCAL_YEAR
0  2022   2022-2023
1  2022   2022-2023
2  2022   2022-2023
3  2022   2022-2023
mozway
  • This is an unbelievably counter-intuitive solution, but it works! Thank you! – SAR Apr 06 '23 at 15:02
  • @SAR if you know the original type for sure, you can get rid of either the `.astype(str)` or the `.astype(int)`; one of them is useless, which one depends on the original type ;) – mozway Apr 06 '23 at 15:03
  • You're right. Removing `astype(str)` from `df['YEAR'].astype(str)` didn't matter and the solution still worked fine. – SAR Apr 06 '23 at 21:57
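
The exchange above can be checked with a small sketch that applies the answer's expression to two hypothetical frames, one with integer years and one with string years. The names `add_fiscal_year`, `df_int`, and `df_str` are made up for illustration and do not appear in the thread.

```python
import pandas as pd


def add_fiscal_year(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a FISCAL_YEAR column such as '2022-2023'."""
    out = df.copy()
    out["FISCAL_YEAR"] = (
        out["YEAR"].astype(str)  # redundant if YEAR already holds strings
        + "-"
        + out["YEAR"].astype(int).add(1).astype(str)  # int cast is redundant if YEAR is int
    )
    return out


# Hypothetical inputs covering both possible dtypes of YEAR.
df_int = pd.DataFrame({"YEAR": [2022, 2022]})
df_str = pd.DataFrame({"YEAR": ["2022", "2022"]})

print(add_fiscal_year(df_int))
print(add_fiscal_year(df_str))
```

Both calls should produce the same `FISCAL_YEAR` values, which is why the expression works without knowing the dtype in advance. Once the dtype is known, the cast marked as redundant for that dtype can be dropped.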