I am currently working on aligning text data, mostly hidden in CSV or Excel files from multiple sources. I've done this easily enough with Python (even on a Raspberry Pi) and OpenOffice. The issues are:
- transforming disparate names to unique names (easy)
- storing the data in CSV or Excel files (because my collabs use Excel)
- eventually building a real DB (SQL-based: MariaDB, Postgres) from the Excel files
- doing statistics on the data: mostly enumeration from different CSV files and comparison between samples; it would be nice to generate graphs
- for debugging purposes, quickly generating bar charts and such for groups of the data
Nothing super fancy, except it gets slow in Python (no doubt generously helped by my "I am not a programmer" 'code'). The data sets will get 'large' (tens of thousands of lines of data times several dozen data sets). I would like a programming tool which facilitates this.
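To make the name-unification and enumeration steps concrete, here is a minimal sketch of the kind of thing I mean, using only the Python standard library. It is not my actual code: the alias table, the column name "name", and the two in-memory sample "files" are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical alias table: each variant spelling (lower-cased, single-spaced)
# maps to one unique name. Real data would need a much longer table.
ALIASES = {
    "acme corp": "ACME Corporation",
    "acme corp.": "ACME Corporation",
    "acme corporation": "ACME Corporation",
}

def canonical_name(raw):
    """Map a raw cell value to its unique name; unknown names pass through trimmed."""
    cleaned = " ".join(raw.split())          # collapse runs of whitespace
    return ALIASES.get(cleaned.lower(), cleaned)

def count_names(csv_file, column):
    """Count unique names in one column of an already opened CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(canonical_name(row[column]) for row in reader)

# Two small in-memory "files" stand in for CSV exports from different sources.
sample_a = io.StringIO("name,value\nAcme Corp,1\nacme   corporation,2\nOther Ltd,3\n")
sample_b = io.StringIO("name,value\nACME CORP.,5\n")

counts_a = count_names(sample_a, "name")
counts_b = count_names(sample_b, "name")

# Compare the two samples name by name.
for name in sorted(set(counts_a) | set(counts_b)):
    print(name, counts_a[name], counts_b[name])
```

For real files the `io.StringIO` objects would be replaced by `open(path, newline="")`. A single pass with a dictionary lookup per row like this should not be the slow part at tens of thousands of lines, so my slowness probably comes from somewhere else in my code.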
I looked into Ch (and also cling and cint) because I still remember a bit of C and would like something interpreted; of these, Ch seems to offer a good set of libraries. Python is OK for much of it, but I dislike the syntax. I try to work on Linux as much as I can, but eventually I have to hand it off to Windows users in a country not known for having fast computers. I was looking at ceemple (ceemple.com) and was wondering if anyone has used it for a project and what their experience has been. Does it help with cross-platform issues (e.g., line termination)? Should I just forget Linux (with that wonderful shell, easy Python, and text editors which can load large files without bogging down) and move everything to Windows? If so, then a compiled language is just about the only way to go for me, likely precluding Ch and probably Python. Please keep in mind that this is my 'side job' - I'm not a professional programmer. A low learning curve and the smallest possible number of required tools are important.
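For comparison, this is how the line-termination issue looks in Python today, as far as I understand it. It is a small sketch with hypothetical helper names (`read_rows`, `write_rows`) and in-memory text instead of real files: the standard `csv` module accepts both Windows (`\r\n`) and Unix (`\n`) line endings when reading, and the ending used when writing can be chosen explicitly.

```python
import csv
import io

def read_rows(text):
    """Parse CSV text into a list of rows, whatever line ending it uses."""
    # newline="" leaves line endings untouched so the csv module handles them itself.
    return list(csv.reader(io.StringIO(text, newline="")))

def write_rows(rows, lineterminator="\r\n"):
    """Serialize rows to CSV text with an explicit line ending."""
    buf = io.StringIO(newline="")
    csv.writer(buf, lineterminator=lineterminator).writerows(rows)
    return buf.getvalue()

windows_text = "a,b\r\n1,2\r\n"
unix_text = "a,b\n1,2\n"

# Both variants parse to the same rows.
rows = read_rows(windows_text)
same = rows == read_rows(unix_text)

# Writing: "\r\n" is the csv module's default; "\n" can be requested instead.
for_windows = write_rows(rows)
for_unix = write_rows(rows, lineterminator="\n")
```

With real files the same idea applies by opening them with `open(path, newline="")`, which is what the `csv` documentation recommends. I would hope any replacement tool handles this at least as painlessly.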