I am attempting to take a substring of each element of a Vector{String} in Julia, where the start and end indices can each be either an Integer or a Vector{Integer}. I want to write a function that basically does "asdf"[1:3], with each of the three arguments in x[y:z] allowed to be either a vector or a single value.

This is what I have attempted so far:

function substring(x::Array{String}, y::Integer, z::Integer)
  y = fill(y, length(x))
  z = fill(z, length(x))
  substring(x, y, z)
end

function substring(x::Vector{String}, y::Vector{Integer}, z::Integer)
  y = fill(y, length(x))
  substring(x, y, z)
end

function substring(x::Vector{String}, y::Integer, z::Vector{Integer})
  z = fill(z, length(x))
  substring(x, y, z)
end

function substring(x::Vector{String}, y::Vector{Integer}, z::Vector{Integer})
  for i = 1:length(x)
    x[i] = x[i][y[i]:min(z[i], length(x[i]))]
    # If z[i] is greater than the length of x[i] 
    # return the end of the string
  end
  x
end

Attempting to use it:

v = string.('a':'z')
x = rand(v, 100) .* rand(v, 100) .* rand(v, 100)

substring(x, 1, 2)
# or
substring(x, 1, s)

I get the error:

MethodError: no method matching substring(::Array{String,1}, ::Int64, ::Array{Int64,1})
Closest candidates are:
  substring(::Array{String,N}, ::Integer, !Matched::Integer) at untitled-e3b9271a972031e628a35deeeb23c4a8:2
  substring(::Array{String,1}, ::Integer, !Matched::Array{Integer,1}) at untitled-e3b9271a972031e628a35deeeb23c4a8:13
  substring(::Array{String,N}, ::Integer, !Matched::Array{Integer,N}) at untitled-e3b9271a972031e628a35deeeb23c4a8:13
  ...
 in include_string(::String, ::String, ::Int64) at eval.jl:28
 in include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N}) at eval.jl:32
 in (::Atom.##53#56{String,Int64,String})() at eval.jl:50
 in withpath(::Atom.##53#56{String,Int64,String}, ::Void) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:38
 in macro expansion at eval.jl:49 [inlined]
 in (::Atom.##52#55{Dict{String,Any}})() at task.jl:60

I see that there is another post addressing a similar error with the type Vector{String}. My post also asks for a response to the error associated with Vector{Integer}. I believe the responses might be helpful for others like me who find the implementation of abstract types novel and difficult.

Francis Smart
  • 2
    Possible duplicate of [Vector{AbstractString} function parameter won't accept Vector{String} input in julia](http://stackoverflow.com/questions/21465838/vectorabstractstring-function-parameter-wont-accept-vectorstring-input-in-j) – Fengyang Wang Apr 11 '17 at 18:38
  • 2
    This is an example of parametric invariance. See http://stackoverflow.com/questions/21465838/vectorabstractstring-function-parameter-wont-accept-vectorstring-input-in-j for a similar problem; here your problem is in `Vector{Integer}`. – Fengyang Wang Apr 11 '17 at 18:38
  • I would like to note that while my question has been interpreted as a type management problem, I am really just looking for a function that does what the title says. In R, to which I am more accustomed, the answer is trivial: `substr(x,1,2)`. I included the code above to show that I have made a reasonable effort at solving the issue myself. If it is not too much trouble, I would really appreciate an answer. – Francis Smart Apr 11 '17 at 21:04
  • 1
    Sure, I've added an answer that addresses your actual problem (in Julia it is not so different from in R). – Fengyang Wang Apr 11 '17 at 21:14

2 Answers

If you're on Julia 0.6, this is pretty easy to do using SubString.(strs, starts, ends):

julia> SubString.("asdf", 2, 3)
"sd"

julia> SubString.(["asdf", "cdef"], 2, 3)
2-element Array{SubString{String},1}:
 "sd"
 "de"

julia> SubString.("asdf", 2, [3, 4])
2-element Array{SubString{String},1}:
 "sd" 
 "sdf"

On Julia 0.5, you can do the same thing, but you must wrap the string in a vector (i.e. it cannot be left as a single scalar):

julia> SubString.(["asdf"], [1, 2, 3], [2, 3, 4])
3-element Array{SubString{String},1}:
 "as"
 "sd"
 "df"

The main difference between Julia and R is that while in R, functions are typically made to work on vectors by default (broadcasted), in Julia you explicitly specify the broadcasting behavior by using a so-called "dot-call", i.e. f.(x, y, z).
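
To tie this back to the question, here is a minimal sketch of the dot-call with the end index clamped to each string's length. It assumes Julia 1.x (where `lastindex` replaces the older `endof`) and ASCII strings; `clamped_substring` is a hypothetical helper name, not something from Base:

```julia
# Hypothetical helper: clamp the end index so it never runs past the string.
# Assumes ASCII input, so every integer index is a valid character index.
clamped_substring(s::AbstractString, start::Integer, stop::Integer) =
    SubString(s, start, min(stop, lastindex(s)))

strs = ["asdf", "cd", "xyz"]

# Scalar start and stop are broadcast across every string.
a = clamped_substring.(strs, 1, 3)

# A vector start is paired elementwise with the strings; the scalar stop is reused.
b = clamped_substring.(strs, [1, 2, 1], 3)
```

Because the dot-call handles the scalar/vector combinations, a single scalar method is enough and the four overloads from the question are not needed.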

Fengyang Wang

Just to make this explicit, as I think it's a super common misconception:

Even though Int64 <: Integer is true

Array{Int64,1} <: Array{Integer,1} is not!


The docs on parametric composite types explain why in detail. To paraphrase, it's basically because Array{Int64,1} has a specific representation in memory (many contiguous 64-bit values), while Array{Integer,1} has to be an array of pointers to separately allocated values that may or may not be 64 bits.

See the similar Q&A for the cool new syntax you can use in Julia 0.6 when declaring functions with regard to this: Vector{AbstractString} function parameter won't accept Vector{String} input in julia
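
As a rough illustration of the above, assuming Julia 0.6 or later for the `Vector{<:Integer}` syntax (and Julia 1.x for `lastindex`); `first_chars` is a hypothetical function made up for this example, and it assumes ASCII strings:

```julia
# Invariance: the subtype relation on element types does not carry over to the container.
elem_ok    = Int64 <: Integer                    # true
invariant  = Vector{Int64} <: Vector{Integer}    # false
with_bound = Vector{Int64} <: Vector{<:Integer}  # true

# Hypothetical function: accepts any vector whose element type is a subtype of Integer,
# so a Vector{Int64} argument dispatches to it.
first_chars(x::Vector{String}, n::Vector{<:Integer}) =
    [s[1:min(k, lastindex(s))] for (s, k) in zip(x, n)]

result = first_chars(["asdf", "cd"], [2, 5])
```

Declaring the question's methods with `Vector{<:Integer}` instead of `Vector{Integer}` should let a call with an `Array{Int64,1}` argument find a matching method.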

Alexander Morley