Questions tagged [extract]

Questions related to retrieving specific information from a (typically minimally structured) data source, such as a web site, media file, source code collection or compressed archive (in which case the desired information is one or more original, uncompressed files). When using this tag, please include additional tags to clarify which specific environment/language/scenario your question refers to.

Data extraction is a term with many different but related meanings, including:

  • Parsing files (such as HTML pages) or file metadata in order to obtain certain information.

  • Retrieving single frames from audio, video or image files

  • Breaking up functionality in a single source code unit (e.g. a function) into multiple units.

  • Retrieving the original files from an (optionally compressed) archive file, such as a .zip or .tar file.

and should be added as a synonym for this tag.

6876 questions
1093
votes
13 answers

How to get first N number of elements from an array

I am working with JavaScript (ES6) / Facebook React and trying to get the first 3 elements of an array that varies in size. I would like to do the equivalent of LINQ take(n). In my JSX file I have the following: var items = list.map(i => { return ( …
user1526912
  • 15,818
  • 14
  • 57
  • 92
665
votes
11 answers

The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe

R provides two different methods for accessing the elements of a list or data.frame: [] and [[]]. What is the difference between the two, and when should I use one over the other?
Sharpie
  • 17,323
  • 4
  • 44
  • 47
320
votes
20 answers

How do you extract a column from a multi-dimensional array?

Does anybody know how to extract a column from a multi-dimensional array in Python?
jaweria
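
A minimal sketch of one common approach, assuming the "multi-dimensional array" is a plain list of rows. The `matrix` data and the `extract_column` helper are hypothetical names used only for illustration; they are not from the question.

```python
# Hypothetical 3x3 table stored as a list of rows (not from the question).
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]


def extract_column(rows, index):
    """Return the value at position `index` from every row."""
    return [row[index] for row in rows]


second_column = extract_column(matrix, 1)
print(second_column)  # [2, 5, 8]
```

If the data were a NumPy array instead, slicing such as `arr[:, 1]` would likely be the more idiomatic route, but that is a different assumption about the input.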
256
votes
15 answers

How to extract all values from a dictionary in Python?

I have a dictionary d = {1:-0.3246, 2:-0.9185, 3:-3985, ...}. How do I extract all of the values of d into a list l?
Naveen C.
  • 3,185
  • 5
  • 21
  • 12
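
A short sketch of one way this might be done in Python 3, using only the three entries visible in the excerpt (the elided `...` is deliberately left out rather than guessed at).

```python
# Only the three entries shown in the question; the rest is elided there.
d = {1: -0.3246, 2: -0.9185, 3: -3985}

# dict.values() returns a view object; list() copies it into a real list.
l = list(d.values())
print(l)  # [-0.3246, -0.9185, -3985]
```

The values come out in insertion order, which regular dictionaries preserve in Python 3.7 and later.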
228
votes
2 answers

How do I move a Git branch out into its own repository?

I have a branch that I'd like to move into a separate Git repository, and ideally keep that branch's history in the process. So far I've been looking at git filter-branch, but I can't make out whether it can do what I want to do. How do I extract a…
Aupajo
  • 5,885
  • 6
  • 30
  • 28
204
votes
7 answers

How can I extract the folder path from file path in Python?

I would like to get just the folder path from the full path to a file. For example T:\Data\DBDesign\DBDesign_93_v141b.mdb and I would like to get just T:\Data\DBDesign (excluding the \DBDesign_93_v141b.mdb). I have tried something like…
Genspec
  • 2,279
  • 2
  • 14
  • 10
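
One possible sketch, assuming the path should be treated as a Windows path no matter which operating system the script runs on. `PureWindowsPath` is used for that reason; the variable names are illustrative only.

```python
from pathlib import PureWindowsPath

# Raw string so the backslashes are not read as escape sequences.
file_path = PureWindowsPath(r"T:\Data\DBDesign\DBDesign_93_v141b.mdb")

# .parent drops the final component (the file name).
folder = file_path.parent
print(folder)  # T:\Data\DBDesign
```

When the script runs on Windows itself, `os.path.dirname` on the same string should give an equivalent result.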
196
votes
6 answers

Accessing last x characters of a string in Bash

I found out that with ${string:0:3} one can access the first 3 characters of a string. Is there an equivalently easy method to access the last three characters?
aldorado
  • 4,394
  • 10
  • 35
  • 46
187
votes
15 answers

How to extract text from a PDF?

Can anyone recommend a library/API for extracting the text and images from a PDF? We need to be able to get at text that is contained in pre-known regions of the document, so the API will need to give us positional information of each element on the…
Budda007
  • 1,903
  • 2
  • 12
  • 3
183
votes
9 answers

Extract first item of each sublist in Python

I'm wondering what is the best way to extract the first item of each sublist in a list of lists and append it to a new list. So if I have: lst = [[a,b,c], [1,2,3], [x,y,z]] And, I want to pull out a, 1 and x and create a separate list from those. I…
konrad
  • 3,544
  • 4
  • 36
  • 75
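
A minimal sketch using a list comprehension. The question writes bare `a`, `b`, `c`; string values are assumed here so the snippet runs on its own, and `first_items` is an illustrative name.

```python
# Strings stand in for the undefined names a, b, c, x, y, z in the question.
lst = [["a", "b", "c"], [1, 2, 3], ["x", "y", "z"]]

# Take element 0 of every sublist; raises IndexError if a sublist is empty.
first_items = [sub[0] for sub in lst]
print(first_items)  # ['a', 1, 'x']
```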
178
votes
17 answers

How to get the first word of a sentence in PHP?

I want to extract the first word of a variable from a string. For example, take this input: The resultant output should be Test, which is the first word of the input. How can I do this?
ali
  • 1,847
  • 2
  • 12
  • 10
169
votes
17 answers

How to extract one column of a csv file

If I have a csv file, is there a quick bash way to print out the contents of only any single column? It is safe to assume that each row has the same number of columns, but each column's content would have different length.
user788171
  • 16,753
  • 40
  • 98
  • 125
163
votes
15 answers

Javascript - How to extract filename from a file input control

When a user selects a file in a web page I want to be able to extract just the filename. I did try str.search function but it seems to fail when the file name is something like this: c:\uploads\ilike.this.file.jpg. How can we extract just the file…
Yogi Yang 007
  • 5,147
  • 10
  • 56
  • 77
137
votes
5 answers

Get string after character

I have a string that looks like this: GenFiltEff=7.092200e-01 Using bash, I would like to just get the number after the = character. Is there a way to do this?
user788171
  • 16,753
  • 40
  • 98
  • 125
115
votes
24 answers

Extract images from PDF without resampling, in python?

How might one extract all images from a pdf document, at native resolution and format? (Meaning extract tiff as tiff, jpeg as jpeg, etc. and without resampling). Layout is unimportant, I don't care where the source image is located on the page.
matt wilkie
  • 17,268
  • 24
  • 80
  • 115
106
votes
4 answers

What algorithm does Readability use for extracting text from URLs?

For a while, I've been trying to find a way of intelligently extracting the "relevant" text from a URL by eliminating the text related to ads and all the other clutter. After several months of researching, I gave it up as a problem that cannot be…
user300981
  • 1,423
  • 5
  • 13
  • 16