Is there a way to build a defrecord with lots of fields? It appears there is a limit of around 122 fields, as this gives a "Method code too large!" error:

(defrecord WideCsvFile
   [a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 a17 a18 a19
    a20 a21 a22 a23 a24 a25 a26 a27 a28 a29 a30 a31 a32 a33 a34 a35 a36 a37 a38 a39
    a40 a41 a42 a43 a44 a45 a46 a47 a48 a49 a50 a51 a52 a53 a54 a55 a56 a57 a58 a59
    a60 a61 a62 a63 a64 a65 a66 a67 a68 a69 a70 a71 a72 a73 a74 a75 a76 a77 a78 a79
    a80 a81 a82 a83 a84 a85 a86 a87 a88 a89 a90 a91 a92 a93 a94 a95 a96 a97 a98 a99
    a100 a101 a102 a103 a104 a105 a106 a107 a108 a109 a110 a111 a112 a113 a114 a115 a116 a117 a118 a119
    a120 a121 a122])

while removing any of the fields allows record creation.

Brian
  • Why do you need to do this? – Sam Estep Mar 01 '16 at 21:11
  • I'm processing a bunch of different csv files. I didn't want my code littered with index numbers by putting the rows in vectors. As some of the files are several million rows, I didn't want to incur the additional garbage collection cost by putting them into regular maps, so defrecord seemed the perfect solution. And it was for all the files up to my last one, which happened to be 126 fields wide. As my file reading code is fairly shared, I wanted to see if there was a way to create larger records before searching for a different method ... – Brian Mar 01 '16 at 21:40
  • Could you use `deftype` instead? – Sam Estep Mar 01 '16 at 21:44
  • Have you tried any of Clojure's CSV libraries, such as https://github.com/davidsantiago/clojure-csv ? I have no experience with them, but they might be better than trying to roll your own. – WolfeFan Mar 01 '16 at 21:44
  • I'm using https://github.com/clojure/data.csv, which seems pretty good. – Brian Mar 01 '16 at 21:47
  • @Elogent I haven't been able to figure out how to access deftypes like maps, perhaps due to inexperience. Would need to change my code if going to use hand coded accessors, which isn't the end of the world, but seems it would lose a lot of the clojure goodness. – Brian Mar 01 '16 at 22:01
  • Perhaps defstructs are an acceptable option? I initially chose defrecords because the documentation seemed to indicate they were a better choice, but maybe this is an edge case. – Brian Mar 01 '16 at 22:16
  • I've edited my answer to show you how you can use vectors and also automatically generate the style of index you want. Please let me know what you think! – WolfeFan Mar 01 '16 at 23:15

1 Answer

The JVM limits the bytecode of a single method to 64 KB (see the answers to this question for specifics). defrecord generates methods whose size grows with the number of fields in the record, so a wide enough record pushes one of them over that limit.

To deal with this issue, I see two options:

  1. macroexpand-1 your call to defrecord, copy the results, and find a way to re-write the generated methods to be smaller.
  2. Take a different approach to storing your data, such as using Clojure's vector class.
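As a rough illustration of option 2, here is a minimal sketch that keeps each row as a plain vector but looks fields up by name rather than by a bare index number. The names `header`, `col-index`, and `field`, as well as the sample data, are hypothetical and only for illustration; in practice the column names would presumably come from the CSV header row.

```clojure
;; Hypothetical column names (assumed; normally taken from the CSV header row).
(def header [:upc :unspsc :description])

;; Map each column name to its position in a row vector.
;; zipmap stops at the shorter collection, so the infinite (range) is fine.
(def col-index (zipmap header (range)))

(defn field
  "Returns the value stored under column k in the row vector.
  Throws if k is not a known column, since col-index returns nil for it."
  [row k]
  (nth row (col-index k)))

;; Made-up sample row.
(def row ["012345678905" "43211503" "Notebook computer"])

(field row :unspsc)
;; => "43211503"
```

Each row stays a vector, so there is no per-row map overhead; only the single `col-index` map is shared across all rows.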

EDIT:

Now that I know what you want to do, I am more convinced that you should use vectors. Since you want to use indexes like a101, I've written you a macro to generate them:

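;; Defines one var per index of v, named <prefix><index> and bound to that index.
;; Note: v is eval'd at macro-expansion time, so it must already be defined.
(defmacro auto-index-vector [v prefix]
  (let [indices (range (count (eval v)))
        definitions (map (fn [ind]
                           `(def ~(symbol (str prefix ind)) ~ind))
                         indices)]
    `(do ~@definitions)))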

Let's try it out!

stack-prj.bigrecord> (def v1 (into [] (range 122)))
#'stack-prj.bigrecord/v1
stack-prj.bigrecord> (auto-index-vector v1 "a")
#'stack-prj.bigrecord/a121
stack-prj.bigrecord> (v1 a101)
101
stack-prj.bigrecord> (assoc v1 a101 "hi!")
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
95 96 97 98 99 100 "hi!" 102 103 104 105 106 107 108 109 110 111 112
113 114 115 116 117 118 119 120 121]

To use this: you'll read your CSV data into a vector, call auto-index-vector on it with the prefix of your choosing, and use the resulting indices to perform vector operations on your data.
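If the columns have meaningful names rather than numbered ones, a variation on the macro above could take the names directly. This is only a sketch under that assumption: `def-column-indices` is a hypothetical name, the column names and row are made up, and it expects a literal vector of symbols, so it does not need `eval`.

```clojure
;; Hypothetical macro: defines one var per column name, bound to its position.
;; names must be a literal vector of symbols, known at macro-expansion time.
(defmacro def-column-indices [names]
  `(do ~@(map-indexed (fn [i n]
                        `(def ~(symbol (name n)) ~i))
                      names)))

;; Made-up column names for illustration.
(def-column-indices [upc unspsc description])

;; Made-up sample row, stored as a plain vector.
(def row ["012345678905" "43211503" "Notebook computer"])

(row unspsc)
;; => "43211503"
```

The trade-off is that each column name becomes a top-level var in the namespace, so names that clash with existing vars would need a prefix.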

WolfeFan
  • Had seen the 64kb code limit, and had done a macroexpand on a two field defrecord before asking the question, and was stunned by its size. Am hoping others have run into this and discovered an alternative way to define records that doesn't do a code explosion! – Brian Mar 01 '16 at 21:44
  • I should have mentioned the actual field names have meaning, e.g., upc or unspsc. Will accept your answer and thanks to both you and @Elogent for the ideas. – Brian Mar 01 '16 at 23:27