ORIGINAL QUERY: Currently i wanted to switch from 32 bit builds to 64 bit, but i have trouble with finiding any library to extract data from xls. I can work with others like (xlsx, xlsm) on ClosedXML, unfortunately xls is not supported.
so right now this connexction will not stand.
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties='Excel 12.0;HDR=NO,IMEX=1'";
string excelQuery = "SELECT * FROM [monthly$A:G]";
Installing 64 bit drivers for OLE is out of a question. So maybe someone has stumble upon it, or have any ideas.
I'd like to build data into datatable with headers similar to HDR=YES
and possibility to switch it off like in excample above.
Additionaly, due to excel formats, dapper is out of the question :(
i Tried to find some questions but all I could find would be to install OLEDB Driver for 64 bits.
EDIT:
I don't understand why someones is donevoting Panagiotis answer, as it's corrected, thats why i accepted it. He's suggestion to use ExcelDataReader
proved to be most efficient as im using more formats of excel like .xlsx
, .xlsm
& .xlsb
in application, additionaly there are some .csv
files.
This allowed me to build a module, that was sufficient.
He was additionaly right to point out that .xls
is ancient, although unfortunately it's not obsolete as a lot of processes i corporation still rely on it. But it should be osbsolete, we are unfortunately stuck due to poorly planned processes not because .xls
is useful. Especially those companies should look at the fact that some formats are not supported, or might lose this support as .xls
is slowly getting that treatment.
ANOTHER EDIT: Unfortunately I just noticed, and the noob i am I didn't know that nugets could be paid. Aspose, as great it is unfotunately is one of those paid extensions. So i had to change the code and add another converter class. To do so I used & modified code used in another [thread ][1] to my needs.
LAST EDIT:
Ok, after update to ExcelDataReader
I wasn't pleased of performance, and converting a file, didn't seemed right to me, as i noticed that @MarkPflug had updated [Sylvan.Data.Excel][2] library, and impressed by benchmarks i wrote code based on that. There were some expetions but after contacting and delivering excamples to Mark, he quickly fixed those. Right now it's fastest and best solution i can think of.
I will update it for my use to extract asynchronously from number of files selected. But based on that, anyone can use it.
**FINAL CODE:**
using SylExcel = Sylvan.Data.Excel;
using SylCSV = Sylvan.Data.Csv;
using SylData = Sylvan.Data;
/// <summary>
/// get data from .csv, .xls, .xlsx, .xlsm, .xlsb. Call is then redirected accordingly
/// </summary>
/// <param name="_filePath">path to file needed</param>
/// <param name="_sheetName">sheet name to extract</param>
/// <param name="_structure">structure of needed file</param>
public void GetData(string _filePath, string? _sheetName, FilesStructures.Structure _structure)
{
//if (CheckIfOpened(_filePath))
//{
// throw new OperationCanceledException(_filePath + " is opened, please close it.");
//}
try
{
string extension = Path.GetExtension(_filePath);
switch (extension)
{
case ".csv":
SylvanReaderCSV(_filePath, _structure);
break;
default:
SylvanReader(_filePath, _sheetName, _structure);
break;
}
//collect file information
GetMetadata(_filePath);
}
catch (Exception ex)
{
throw new Exception(ex.Message);
}
}
/// <summary>
/// sylvan reader for .xls, .xlsx, .xlsm, .xlsb
/// </summary>
/// <param name="_filePath">path to file needed</param>
/// <param name="_sheetName">sheet name to extract</param>
/// <param name="_structure">structure of needed file</param>
/// <exception cref="Exception"></exception>
private void SylvanReader(string _filePath, string? _sheetName, FilesStructures.Structure _structure)
{
try
{
string headerSchema = "";
//get unique column list to test for duplicates
List<string> columnList = new List<string>();
//build headers
//_structure.HeaderList = List<strings> passed by object _structure
foreach (var header in _structure.HeaderList)
{
if (!columnList.Exists(column => column == header))
{
headerSchema += header + ", ";
columnList.Add(header);
}
}
//remove last 2 chars
headerSchema = headerSchema.Substring(0, headerSchema.Length - 2);
var options = new SylExcel.ExcelDataReaderOptions { Schema = SylExcel.ExcelSchema.NoHeaders, GetErrorAsNull = true };
//collect data
using (SylExcel.ExcelDataReader excelDataReader = SylExcel.ExcelDataReader.Create(_filePath, options))
{
//loop to locate sheet
while (excelDataReader.WorksheetName != _sheetName)
{
excelDataReader.NextResult();
if (excelDataReader.WorksheetName == null)
{
throw new Exception("didnt find the sheet");
}
}
//loop to find headers
for(int i = 0; i < _structure.StartRow; i++)
{
excelDataReader.Read();
}
// parse the schema, and use it to reinitialize the schema for the sheet.
var schema = SylData.Schema.Parse(headerSchema);
excelDataReader.InitializeSchema(schema.GetColumnSchema(), useHeaders: true);
DataTable = new DataTable();
DataTable.Load(excelDataReader);
}
}
catch(Exception ex)
{
throw new Exception(ex.Message);
}
}
/// <summary>
/// sylvan reader for .csv
/// </summary>
/// <param name="_filePath">path to file needed</param>
/// <param name="_sheetName">sheet name to extract</param>
/// <param name="_structure">structure of needed file</param>
/// <exception cref="Exception"></exception>
private void SylvanReaderCSV(string _filePath, FilesStructures.Structure _structure)
{
try
{
string headerSchema = "";
//get unique column list to test for duplicates
List<string> columnList = new List<string>();
//build headers
//_structure.HeaderList = List<strings> passed by object _structure
foreach (var header in _structure.HeaderList)
{
if (!columnList.Exists(column => column == header))
{
headerSchema += header + ", ";
columnList.Add(header);
}
}
//remove last 2 chars
headerSchema = headerSchema.Substring(0, headerSchema.Length - 2);
var schema = new SylCSV.CsvSchema(SylData.Schema.Parse(headerSchema));
var options = new SylCSV.CsvDataReaderOptions //check if get error as null possible
{
Schema = schema,
};
using (SylCSV.CsvDataReader csvDataReader = SylCSV.CsvDataReader.Create(_filePath, options))
{
DataTable = new DataTable();
DataTable.Load(csvDataReader);
}
}
catch (Exception ex)
{
throw new Exception(ex.Message);
}
}
[1]: https://stackoverflow.com/questions/28398071/reading-excel-xlsb-files-in-c-sharp
[2]: https://github.com/MarkPflug/Sylvan.Data.Excel