Questions tagged [go-colly]

Colly is a web scraping framework written in Go. Its import path is github.com/gocolly/colly (repository: https://github.com/gocolly/colly). This tag is typically used together with the main tag [go].

63 questions

1 answer

Go Colly not returning any data from website

I am trying to make a simple web scraper in Go and I can't seem to get even the most basic functionality out of colly. I took the basic example from the colly docs, and while it worked with the hackernews.org site they used, it isn't working with the site I…

go web-scraping go-colly

asked Dec 25 '21 at 09:19

Cade

1 answer

Add colly package output text to a map in Go

I was making a web scraper with the colly package that collects the ContestName and ContestTime from a website and writes them to a JSON file. So I did this: Contests := make(map[string]map[string]map[string]map[string]string) …

json go web-scraping go-map go-colly

asked Nov 15 '22 at 12:26

Vinay Kumar Rasala

1 answer

Get values from elements with the same class name in colly web scraping

I am working on a small web scraping application using Go and the colly web scraping framework. Here is the HTML code of the website

go go-colly

asked Oct 20 '21 at 14:13

Dinesh s

0 answers

Passing cookies from Go Rod (Headless browser) to requests, Colly cookiejar

I am trying to pass cookies from a headless browser in Go to the requests package cookiejar. There are some JS-generated cookies that I need to grab using the headless browser and then pass to the requests module. I currently have this to export…

json go cookies cookiejar go-colly

asked Oct 05 '21 at 01:04

AntBox

1 answer

How to use selectors properly

I'm writing a crawler to retrieve some data from some pages. The logic of how to build it is very clear to me, but I am very confused about how to use the selectors properly. I would like to get the title of some news using colly. I went to the page…

go web-scraping web-crawler go-colly

asked Nov 09 '20 at 23:25

MrByte

1 answer

How to suppress the "Max depth limit reached" output in Go Colly

I have a Go Colly crawler that I am using to crawl many sites. My terminal prints a lot of: 2023/05/30 02:22:56 Max depth limit reached 2023/05/30 02:22:56 Max depth limit reached 2023/05/30 02:22:56 Max depth limit reached 2023/05/30…

go go-colly

asked May 29 '23 at 22:53

Farshad

1 answer

Scraping all possible tags and putting them into one variable using Go Colly

I need to scrape different tags from a list of sites, put them in a variable, and then put them in a .csv list. For example, all lines where the author of the article is mentioned (div.author, p.author etc). On all sites, the location of this line and the…

go web-scraping go-colly

asked Apr 05 '23 at 14:02

Maxim Zhukotsky

1 answer

Max Rate limit of StackOverflow

I have been trying to access StackOverflow at a rate of 30 requests/second, but it is not working. I get blocked after a few seconds, although the StackOverflow documentation says the max rate limit of StackExchange is 30 req/s. The…

go go-colly

asked Jan 09 '23 at 07:41

Hiếu Nguyễn Trung

1 answer

Web scraping using Golang Colly: how to handle XML path not found?

I am using Colly to scrape an e-commerce website. I will loop over many products. Here is a snippet of my code getting a sub-title: c.OnXML("/html/body/div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1/1234", func(e *colly.XMLElement) { …

xml go go-colly

asked Dec 29 '22 at 10:01

Chau Loi

1 answer

Go Colly how to find requested element?

I'm trying to get a specific table and loop through its content using colly, but the table is not being recognized. Here's what I have so far: package main import ( "fmt" "github.com/gocolly/colly" ) func main() { c :=…

go web-scraping go-colly

asked Dec 28 '22 at 14:32

Lynx

1 answer

How do I scrape TLS certificates using go-colly?

I am using Colly to scrape a website and I am trying to also get the TLS certificate that the site is presenting during the TLS handshake. I looked through the documentation and the response object but did not find what I was looking for. According…

go ssl web-scraping go-colly

asked Jul 08 '22 at 22:34

user234980238402

1 answer

Go Colly parallelism decreases the number of links scraped

I am trying to build a web scraper to scrape jobs from internshala.com. I am using Go Colly to build the web scraper. I visit every page and then visit the subsequent links of each job to scrape data from. Doing this in a sequential manner scrapes…

go web-scraping web-crawler go-colly

asked May 17 '22 at 08:12

Adnan

0 answers

Web scraping site using polymerjs / webcomponent

I'm using colly to scrape YouTube charts. This site uses polymerjs and, as a result, I'm having trouble capturing the DOM elements. A simple test I did was running document.querySelector("#search-native") in the console, and it returns null. I saw an…

web-scraping polymer web-component go-colly

asked Apr 26 '22 at 21:45

Jess

1 answer

What can the go-colly library do?

Can the go-colly library crawl all HTML tags and text content under a div tag? If so, how? I can get all the text under a div tag, like this: c.OnHTML("body .post-topic-main .post-topic-des", func(e *colly.HTMLElement) { text =…

go go-colly

asked Apr 07 '22 at 09:38

N Fx

1 answer

Parsing nested elements using go-colly scraper

I'm using go-colly to scrape data from a webpage, but I'm unable to parse out the image src from this nested HTML element. c.OnHTML(".result-row", func(e *colly.HTMLElement) { qoquerySelection := e.DOM …

go web-scraping go-colly

asked Mar 31 '22 at 02:09

Ryan
