
I used Beautiful Soup for web scraping. With the code below, I extracted the years:


y = [elem.string.strip("\n") for elem in years]

I obtained this list


To remove the square brackets, commas, and quotation marks, I converted the list to a string with this code:

y=str(y)[1:-1].replace(" ' ", " ").replace(",", " ")

But the output displays all the elements on one line, like this:

'1901 1902 1903 1904 1905 1906 1907 1908 1909 1910 1911 1912 1913 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926 1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942 1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998 1999 2001 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019'

I want the years displayed vertically instead, one per line. What am I doing wrong?

  • What are you trying to achieve where you need a "vertical" list? The formatting in your question is not great, so it's not obvious whether this is the case, but maybe you have a string and would like to have it as a list instead? – lucidbrot Apr 19 '21 at 06:11
  • @lucidbrot, I am extracting info about the nobel prize winners from the Wikipedia page. Ultimately, I want to add the obtained info about years to a data frame and save it as a csv file. And, I apologize for the formatting of the question. I am new to StackOverflow and this is my first question. –  Apr 19 '21 at 06:18
  • Welcome to SO then :) I've made an attempt at an answer. Please clarify if I made wrong assumptions. (Btw, this is likely also the reason for the downvotes you're getting: It's not clear what exactly you are asking for.) – lucidbrot Apr 19 '21 at 06:51
  • Barsha Thakur, welcome to Stack Overflow! Don't mind the downvotes, as some people can be rather narrow-minded and not able to think outside the box. @lucidbrot supplied a good answer. – IODEV Apr 19 '21 at 07:17
  • Btw, on Stack Overflow, it is customary to upvote and check the box for the correct answer in order to give @lucidbrot the credit he deserves. To mark an answer as accepted, click on the check mark beside the answer to toggle it from greyed out to filled in. – IODEV Apr 19 '21 at 07:30
  • @IODEV okay, thank you for letting me know that. I am sorry, I am still trying to figure out this platform. I was stuck on this problem and was desperate to figure out the solution. –  Apr 19 '21 at 07:33
  • FYI it's __scraping__ (and __scrape__, __scraped__, __scraper__) not scrapping – DisappointedByUnaccountableMod Apr 20 '21 at 16:10
  • @barny I know it's scraping. I wrote scrapping by mistake. Thanks for correcting that, but adding a comment for a spelling mistake really seems unnecessary. –  Apr 21 '21 at 05:15

1 Answer


To save it to a CSV file later, you don't really care how it looks when printed, only how Python sees it and works with it. There are various ways of writing to CSV, but most of them require something that can be iterated over, usually a list.
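As a rough illustration of that point, here is a minimal sketch that writes a list of years to CSV with Python's standard csv module (not a data frame). The function name write_years_csv, the "Year" column header, and the sample values are assumptions made for this example; the real list would come from the scraping step.

```python
import csv
import io


def write_years_csv(years, out_file):
    """Write one year per row under a header row.

    The "Year" header is a hypothetical column name chosen for this sketch.
    """
    writer = csv.writer(out_file)
    writer.writerow(["Year"])
    for year in years:
        writer.writerow([year])


# Stand-in for the scraped list of years.
y = ["1901", "1902", "1903", "1904"]

# Write into an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
write_years_csv(y, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with open("years.csv", "w", newline="") in place of the buffer; the file name here is only a placeholder.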

What you currently have is a string, I think. You can verify that with type(y). If it says <class 'str'>, it's a string.

So you can split the string on the spaces to get a list.

# example
y = '1901 1902 1903 1904'
the_list = y.split()
# output: ['1901', '1902', '1903', '1904']

If that is not what you want, and you actually want it printed vertically, you need a newline after each number. One way to do so is to take the list of numbers generated above and "join" its elements.

\n stands for the newline character.

string_with_newlines = '\n'.join(the_list)
print(string_with_newlines)
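If the list from the question is still available, the round trip through str() may not be needed at all, since join works on the list directly and never produces brackets, commas, or quotes. A small sketch, assuming the scraped values look like the stand-in list raw_years below (a made-up name and made-up sample data):

```python
# Stand-in for the scraped text values, including stray newlines.
raw_years = ["1901\n", "1902\n", "1903\n"]

# Clean each element, keeping the result as a list.
y = [text.strip() for text in raw_years]

# Join the list elements with newlines to print one year per line.
vertical = "\n".join(y)
print(vertical)
```

The list y stays usable for later steps such as writing a CSV file, while vertical is only for display.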
lucidbrot
  • Thank you for the answer. The second snippet (string_with_newlines = '\n'.join(the_list)) gave me the result I wanted. Basically, I did not need to convert the list to a string; I should have used string_with_newlines = '\n'.join(y) instead. I had converted it only to remove the square brackets, the commas between elements, and the quotation marks around them. –  Apr 19 '21 at 07:19
  • @Barsha Thakur: on Stack Overflow, it is customary to upvote and check the box for the correct answer in order to give lucidbrot the credit he deserves. To mark an answer as accepted, click on the check mark beside the answer to toggle it from greyed out to filled in. – IODEV Apr 19 '21 at 07:31