I have been splitting file content by character count, because one of my applications does not accept more than a certain number of characters at a time. So I have to split the content by character count and send the update several times, depending on the total character count.

# c_count (the total character count) is assumed to be defined earlier
c_max = 50000
f = File.new(filename)
u_count = (c_count / c_max.to_f).ceil
i = 1
while i <= u_count do
    u_characters = f.sysread(c_max)
    # do stuff
    i += 1
end

But it's not working when I use a string instead of a filename.

content = File.read(filename)
# doing some stuff with the content
irb(main):006:0> content.sysread(10)
NoMethodError: undefined method `sysread' for #<String:0x7f5f2eedd368>
        from (irb):6
        from :0
irb(main):007:0>
Karthi1234

2 Answers

2

If you are trying to limit the number of characters read at a time, IO#each_line can help with this, e.g.

c_max = 100
File.open(filename) do |file|
  file.each_line(c_max) do |characters|
    # characters is now a string of at most 100 bytes
    # (a chunk ends early at a newline or at the end of the file)
    # do something with characters
  end
end

Or, to handle a String, you can wrap it in a StringIO, and #each_line will work too. (Please note that String#each_line will not work: it accepts only a separator (a String), not a character limit (an Integer), which is why we need StringIO.)

require 'stringio'

s = "THIS IS A STRING"
StringIO.open(s) do |strio|
  strio.each_line(2) do |characters|
    # characters is now a string of at most 2 bytes
    # do something with characters
  end
end
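
As a minimal sketch of the difference described above (the sample string is made up for illustration), the following shows that StringIO#each_line with a limit yields chunks of at most that many bytes, that a newline still ends a chunk early, and that String#each_line rejects an Integer argument:

```ruby
require 'stringio'

# With a limit of 3, chunks are at most 3 bytes long,
# and the newline ends the first chunk early.
chunks = StringIO.new("a\nbcdef").each_line(3).to_a
p chunks

# String#each_line expects a separator, so an Integer is rejected.
string_error =
  begin
    "a\nbcdef".each_line(3).to_a
    nil
  rescue TypeError => e
    e.class
  end
p string_error
```

If the content can contain newlines, the chunks are therefore not all the same length, which may or may not matter for the receiving application.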

So let's handle both options. Update (based on a comment discussion with @CarySwoveland, thanks for pushing me further):

require 'stringio'

def do_stuff(line)
  # common functionality goes here
  puts line
end

# Returns the File or StringIO that was iterated. The block form of open
# closes it when the block exits, so no file descriptor is leaked.
def my_method(s, sep_or_char_limit = 100)
  target = s.to_s # leverage a little duck typing
  # treat the argument as a path if such a file exists, else as content
  target_class = File.file?(target) ? File : StringIO
  target_class.open(target) do |io|
    io.each_line(sep_or_char_limit, &method(:do_stuff))
  end
end

Since each_line yields one chunk at a time, this also helps with memory consumption: the whole File need not be read into memory first, and the String does not need to be split into a temporary Array.

There is additional hidden functionality here as well. You asked for a limit by characters, but this also allows a separator if you prefer, e.g.

# Using a Character Limit
my_method("THIS IS A STRING",2)
# TH
# IS
#  I
# S
# A
# ST
# RI
# NG


# Using a separator 
my_method("THIS IS A STRING",' ')
# THIS 
# IS 
# A 
# STRING
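
If every chunk must be exactly the limit (except the last), regardless of line breaks, a read loop is one alternative to each_line. The sketch below is not part of the original answer: each_chunk is a hypothetical helper name, and it swaps each_line for read(length), which counts bytes rather than characters.

```ruby
require 'stringio'

# Hypothetical helper (not from the answer above): yields successive
# chunks of at most `size` bytes and ignores line breaks.
# `source` is treated as a path if such a file exists, else as content.
def each_chunk(source, size)
  target = source.to_s
  io_class = File.file?(target) ? File : StringIO
  io_class.open(target) do |io|
    # read(size) returns nil at end of input, which ends the loop
    while (chunk = io.read(size))
      yield chunk
    end
  end
end

each_chunk("THIS IS A STRING", 5) { |chunk| puts chunk }
```

Because read(length) works on bytes, multibyte characters could be split across chunks; for text that is not single-byte, a character-based approach may be the safer choice.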
engineersmnky
  • I like. A variant for the body of `my_method`: `(File.file?(s) ? File.open(s) : StringIO.new(s)).each_line(sep_or_char_limit, &method(:do_stuff))`. This omits your `target = s.to_s`, as I didn't understand why that's needed. – Cary Swoveland Jun 06 '17 at 18:01
  • @CarySwoveland `File.open` suggests that I also need to `close`, whereas `foreach` will close the file on block termination. The `to_s` is there for edge cases, say `my_method([1,2,3,4,5],',')`: this will not fail, because the true definition from a documentation standpoint is simply any object that has a public `to_s` method – engineersmnky Jun 06 '17 at 18:06
  • I considered that but the file will be closed when the method exits. I could of course begin with `target = s.to_s` and change `s` to `target` in what I suggested. – Cary Swoveland Jun 06 '17 at 18:07
  • @CarySwoveland no it will be marked for Garbage Collection and once it is collected it will be auto closed at that point and who knows when that will be [See Here](https://stackoverflow.com/a/4795782/1978251) from the venerable Jorg W Mittag – engineersmnky Jun 06 '17 at 18:09
  • Ah, I didn't know the file isn't closed right away. Without `close` would the object `StringIO.new(target)` be garbage-collected, rather than released right away? If so, and the amount of memory is significant, perhaps `target = s.to_s; obj = File.file?(target) ? File.open(target) : StringIO.new(target); obj.each_line(sep_or_char_limit, &method(:do_stuff)); obj.close`. – Cary Swoveland Jun 06 '17 at 18:27
  • @CarySwoveland While it is not "closed", closing it has no impact, since closing does not release the reference to the accumulated String. We could replace this portion with `StringIO.open(target) { |strio| strio.each_line(sep_or_char_limit, &method(:do_stuff))}`, which will ensure it is closed (edited for consistency's sake) but will have no impact on memory consumption, so it's sort of six of one, half a dozen of the other. I definitely understand what you are getting at, but I think I still prefer the separation of concerns here from a readable and understandable standpoint. – engineersmnky Jun 06 '17 at 18:42
  • @CarySwoveland updated just for you both `File` and `StringIO` without duplication. – engineersmnky Jun 06 '17 at 18:52
  • In your update I like your use of `open` (didn't know about that) and the fact that you made the only difference the class, rather than the method. – Cary Swoveland Jun 06 '17 at 19:06
  • @CarySwoveland I aim to please and our discussion made for a better technical implementation (DRY and consistent return value). Thanks :) – engineersmnky Jun 06 '17 at 19:09
1

The problem is that a String object doesn't have a sysread method. To get the first n characters of a String, you can use String#[] (also available as String#slice) with a range:

content[0...10]

or slice(start, length):

content[0, 10]
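
To cover the whole string rather than only the first n characters, the same slice(start, length) call can be repeated with a moving offset. This is a minimal sketch; chunk_string is a hypothetical helper name, not something from the question or from Ruby itself:

```ruby
# Hypothetical helper: splits `content` into pieces of at most `c_max`
# characters using String#[] with (start, length).
def chunk_string(content, c_max)
  raise ArgumentError, 'c_max must be positive' unless c_max.positive?

  chunks = []
  offset = 0
  while offset < content.length
    chunks << content[offset, c_max] # the last piece may be shorter
    offset += c_max
  end
  chunks
end

chunk_string("THIS IS A STRING", 5).each { |piece| puts piece }
```

Unlike sysread, this counts characters rather than bytes, but it does build an Array holding every piece, so it suits content that already fits in memory.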
the Tin Man