57

Is there any simple way to convert .xls file to .csv file ? (Excel)

in C# code ?

I mean to take an existing .xls file and convert them to .csv file

Braiam
  • 1
  • 11
  • 47
  • 78
Gold
  • 60,526
  • 100
  • 215
  • 315
  • There is also this stackoverflow Q/A https://stackoverflow.com/questions/57702766/i-want-to-convert-a-xls-or-xlsx-file-to-csv-format-using-c-sharp – andrew pate Feb 21 '23 at 12:35

9 Answers9

33

Here's a C# method to do this. Remember to add your own error handling - this mostly assumes that things work for the sake of brevity. It's 4.0+ framework only, but that's mostly because of the optional worksheetNumber parameter. You can overload the method if you need to support earlier versions.

static void ConvertExcelToCsv(string excelFilePath, string csvOutputFile, int worksheetNumber = 1) {
   if (!File.Exists(excelFilePath)) throw new FileNotFoundException(excelFilePath);
   if (File.Exists(csvOutputFile)) throw new ArgumentException("File exists: " + csvOutputFile);

   // connection string
   var cnnStr = String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;IMEX=1;HDR=NO\"", excelFilePath);
   var cnn = new OleDbConnection(cnnStr);

   // get schema, then data
   var dt = new DataTable();
   try {
      cnn.Open();
      var schemaTable = cnn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
      if (schemaTable.Rows.Count < worksheetNumber) throw new ArgumentException("The worksheet number provided cannot be found in the spreadsheet");
      string worksheet = schemaTable.Rows[worksheetNumber - 1]["table_name"].ToString().Replace("'", "");
      string sql = String.Format("select * from [{0}]", worksheet);
      var da = new OleDbDataAdapter(sql, cnn);
      da.Fill(dt);
   }
   catch (Exception e) {
      // ???
      throw e;
   }
   finally {
      // free resources
      cnn.Close();
   }

   // write out CSV data
   using (var wtr = new StreamWriter(csvOutputFile)) {
      foreach (DataRow row in dt.Rows) {
         bool firstLine = true;
         foreach (DataColumn col in dt.Columns) {
            if (!firstLine) { wtr.Write(","); } else { firstLine = false; }
            var data = row[col.ColumnName].ToString().Replace("\"", "\"\"");
            wtr.Write(String.Format("\"{0}\"", data));
         }
         wtr.WriteLine();
      }
   }
}
B. Clay Shannon-B. Crow Raven
  • 8,547
  • 144
  • 472
  • 862
mattmc3
  • 17,595
  • 7
  • 83
  • 103
31

Checkout the .SaveAs() method in Excel object.

wbWorkbook.SaveAs("c:\yourdesiredFilename.csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSV)

Or following:

public static void SaveAs()
{
    Microsoft.Office.Interop.Excel.Application app = new Microsoft.Office.Interop.Excel.ApplicationClass();
    Microsoft.Office.Interop.Excel.Workbook wbWorkbook = app.Workbooks.Add(Type.Missing);
    Microsoft.Office.Interop.Excel.Sheets wsSheet = wbWorkbook.Worksheets;
    Microsoft.Office.Interop.Excel.Worksheet CurSheet = (Microsoft.Office.Interop.Excel.Worksheet)wsSheet[1];

    Microsoft.Office.Interop.Excel.Range thisCell = (Microsoft.Office.Interop.Excel.Range)CurSheet.Cells[1, 1];

    thisCell.Value2 = "This is a test.";

    wbWorkbook.SaveAs(@"c:\one.xls", Microsoft.Office.Interop.Excel.XlFileFormat.xlWorkbookNormal, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
    wbWorkbook.SaveAs(@"c:\two.csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSVWindows, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Microsoft.Office.Interop.Excel.XlSaveAsAccessMode.xlShared, Type.Missing, Type.Missing, Type.Missing, Type.Missing, Type.Missing);

    wbWorkbook.Close(false, "", true);
}
KMån
  • 9,896
  • 2
  • 31
  • 41
  • 1
    thank's for the help, but how i can convert an existing file ? – Gold Mar 29 '10 at 15:42
  • In case you were still wondering, you will need to open the existing file as a workbook first using the Excel.Application.Workbooks.Open() method and use that as the wbWorkbook parameter – markdigi Sep 08 '10 at 11:21
  • 1
    using xlSaveAsAccessMode.xlNoChange is more reliable (the call to SaveAs crashs in my case if I use xlShared). – Benlitz Aug 24 '11 at 13:46
  • 3
    I found that this answer does have a flaw in it. If you have characters in a foreign language (Chinese, etc.), you have to set the second parameter to XlUnicodeText. This allows you to preserve the characters, however you lose the CSV format. It saves it as tab delimited. There's a parameter near the end where you can set the code page, however upon scouring the internet, it seems that no one knows how to make it work. – mj_ Aug 28 '12 at 15:06
  • You should throw an app.quit() at the end of that to close the Excel instance – Bob Probst Oct 26 '16 at 15:59
  • 1
    Interop type 'ApplicationClass' cannot be embedded? Just replace 'ApplicationClass' by 'Application'. – Juan Carlos Puerto Apr 18 '18 at 09:39
19

Install these 2 packages

<packages>
  <package id="ExcelDataReader" version="3.3.0" targetFramework="net451" />
  <package id="ExcelDataReader.DataSet" version="3.3.0" targetFramework="net451" />
</packages>

Helper function

using ExcelDataReader;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ExcelToCsv
{
    public class ExcelFileHelper
    {
        public static bool SaveAsCsv(string excelFilePath, string destinationCsvFilePath)
        {

            using (var stream = new FileStream(excelFilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                IExcelDataReader reader = null;
                if (excelFilePath.EndsWith(".xls"))
                {
                    reader = ExcelReaderFactory.CreateBinaryReader(stream);
                }
                else if (excelFilePath.EndsWith(".xlsx"))
                {
                    reader = ExcelReaderFactory.CreateOpenXmlReader(stream);
                }

                if (reader == null)
                    return false;

                var ds = reader.AsDataSet(new ExcelDataSetConfiguration()
                {
                    ConfigureDataTable = (tableReader) => new ExcelDataTableConfiguration()
                    {
                        UseHeaderRow = false
                    }
                });

                var csvContent = string.Empty;
                int row_no = 0;
                while (row_no < ds.Tables[0].Rows.Count)
                {
                    var arr = new List<string>();
                    for (int i = 0; i < ds.Tables[0].Columns.Count; i++)
                    {
                        arr.Add(ds.Tables[0].Rows[row_no][i].ToString());
                    }
                    row_no++;
                    csvContent += string.Join(",", arr) + "\n";
                }
                StreamWriter csv = new StreamWriter(destinationCsvFilePath, false);
                csv.Write(csvContent);
                csv.Close();
                return true;
            }
        }
    }
}

Usage :

var excelFilePath = Console.ReadLine();
string output = Path.ChangeExtension(excelFilePath, ".csv");
ExcelFileHelper.SaveAsCsv(excelFilePath, output);
Jignesh Variya
  • 1,869
  • 16
  • 12
  • i tried using this lib but encountered a blocker, raised here: https://github.com/ExcelDataReader/ExcelDataReader/issues/350 – Ruslan Aug 30 '18 at 14:04
  • ExcelDataReader is working great for me, however, If you are converting a large Excel file to CSV I would recommend you avoid reading the entire dataset into memory with AsDataSet and instead read the data in row by row and manage the output without storing large amounts of data in memory. A good example of how to do this is located on their wiki here https://github.com/ExcelDataReader/ExcelDataReader#how-to-use – M3SSYM4RV1N Oct 03 '18 at 20:03
  • thanks so much for this great function. i have a date in XLSX format `10/08/2018` for some reason when i convert this to CSV format, it's inserting a time as well `10/08/2018 12:00:00AM` i've attempted attacking this by setting `UseColumnDataType = false` however, that yielded the same result – Alex Gordon Nov 21 '18 at 16:50
  • 1
    Daft question - what happens if you've more than one sheet? – Richard Griffiths Mar 01 '19 at 13:11
  • With .Xls extension file I wanted to use this, it worked like a charm, for multiple sheets I will check and update. This answer would be more ideal because it doesnt have dependency on microsoft office. – Harish Patil Sep 11 '21 at 10:26
  • note some data rows may need escaped if they have commas within them. ex: "a,b" would need: arr.Add("\"" + ds.Tables[0].Rows[row_no][i].ToString() + "\""); – MTMDev Oct 12 '21 at 13:51
9

I had to come up with a hybrid solution after updating to Visual Studio 2022 and testing the most relevant answers for this to work.

First, we need to install the following Nuget packages: ExcelDataReader, ExcelDataReader.DataSet and System.Text.Encoding.CodePages

Then, for the sake of clean architecture, proceed to create a separate class within your corresponding namespace:

using ExcelDataReader;
using System.Collections.Generic;
using System.IO;
using System.Text;

namespace YourProjectNameSpace
{
    public class ExcelFileHelper
    {

        /// <summary>
        /// Converts a given XLS into CSV file format.
        /// </summary>
        public static bool SaveAsCsv(string excelFilePath, string destinationCsvFilePath)
        {

            Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

            using (var stream = new FileStream(excelFilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                IExcelDataReader reader = null;
                if (excelFilePath.EndsWith(".xls"))
                {
                    reader = ExcelReaderFactory.CreateBinaryReader(stream);
                }
                else if (excelFilePath.EndsWith(".xlsx"))
                {
                    reader = ExcelReaderFactory.CreateOpenXmlReader(stream);
                }

                if (reader == null)
                    return false;

                var ds = reader.AsDataSet(new ExcelDataSetConfiguration()
                {
                    ConfigureDataTable = (tableReader) => new ExcelDataTableConfiguration()
                    {
                        UseHeaderRow = false
                    }
                });

                var csvContent = string.Empty;
                int row_no = 0;
                while (row_no < ds.Tables[0].Rows.Count)
                {
                    var arr = new List<string>();
                    for (int i = 0; i < ds.Tables[0].Columns.Count; i++)
                    {
                        arr.Add(ds.Tables[0].Rows[row_no][i].ToString());
                    }
                    row_no++;
                    csvContent += string.Join(",", arr) + "\n";
                }
                StreamWriter csv = new StreamWriter(destinationCsvFilePath, false);
                csv.Write(csvContent);
                csv.Close();
                return true;
            }
        }
    }
}

Notice I had to include this line at the beginning of the function:

Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

If the line from above is omitted, you could end up getting the following error:

NotSupportedException: No data is available for encoding 1252

So, make sure of using it for better compatibility.

Finally, use example:

var execPath = Path.GetDirectoryName(System.Reflection.Assembly.GetExecutingAssembly().GetName().CodeBase)?.Replace("file:\\", "");
    
string FileNameXLS = "\\file.xls";
string FileNameCSV = "\\file.csv";
        
Console.WriteLine("Exporting file to CSV...." + "\n");
ExcelFileHelper.SaveAsCsv(execPath + FileNameXLS, execPath + FileNameCSV);
Console.WriteLine("File exported to CSV!" + "\n");
Joseph L.
  • 431
  • 6
  • 7
  • 1
    This worked perfectly! When replacing the comma with a semicolon in the string.Join() this produces exactly the same result as using Save As CSV UTF-8 (Comma-delimited) in Excel. – Martin H Dec 03 '21 at 17:38
  • This worked great as is for my project, thank you – Barrassment Jan 06 '22 at 15:01
  • 1
    Thank you. I was able to easily modify to use a pipe as the delimiter and to skip leading rows. `SaveAsCsv(string excelFilePath, string destinationCsvFilePath, int startRow = 0, string delimiter = ",")` – CJ Edgerton Feb 21 '22 at 20:58
3

I need to do the same thing. I ended up with something similar to Kman

       static void ExcelToCSVCoversion(string sourceFile,  string targetFile)
    {
        Application rawData = new Application();

        try
        {
            Workbook workbook = rawData.Workbooks.Open(sourceFile);
            Worksheet ws = (Worksheet) workbook.Sheets[1];
            ws.SaveAs(targetFile, XlFileFormat.xlCSV);
            Marshal.ReleaseComObject(ws);
        }

        finally
        {
            rawData.DisplayAlerts = false;
            rawData.Quit();
            Marshal.ReleaseComObject(rawData);
        }


        Console.WriteLine();
        Console.WriteLine($"The excel file {sourceFile} has been converted into {targetFile} (CSV format).");
        Console.WriteLine();
    }

If there are multiple sheets this is lost in the conversion but you could loop over the number of sheets and save each one as csv.

nate_weldon
  • 2,289
  • 1
  • 26
  • 32
3

This is a modification of nate_weldon's answer with a few improvements:

  • More robust releasing of Excel objects
  • Set application.DisplayAlerts = false; before attempting to save to hide prompts

Also note that the application.Workbooks.Open and ws.SaveAs methods expect sourceFilePath and targetFilePath to be full paths (ie. directory path + filename)

private static void SaveAs(string sourceFilePath, string targetFilePath)
{
    Application application = null;
    Workbook wb = null;
    Worksheet ws = null;

    try
    {
        application = new Application();
        application.DisplayAlerts = false;
        wb = application.Workbooks.Open(sourceFilePath);
        ws = (Worksheet)wb.Sheets[1];
        ws.SaveAs(targetFilePath, XlFileFormat.xlCSV);
    }
    catch (Exception e)
    {
        // Handle exception
    }
    finally
    {
        if (application != null) application.Quit();
        if (ws != null) Marshal.ReleaseComObject(ws);
        if (wb != null) Marshal.ReleaseComObject(wb);
        if (application != null) Marshal.ReleaseComObject(application);
    }
}
datchung
  • 3,778
  • 1
  • 28
  • 29
2

I integrate the @mattmc3 aswer. If you want to convert a xlsx file you should use this connection string (the string provided by matt works for xls formats, not xlsx):

var cnnStr = String.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0;IMEX=1;HDR=NO\"", excelFilePath);
Erica B.
  • 31
  • 2
1

I had a similar task issue. Convert an inbound transactions like xlsx to tab delimited for automated processing by an existing system. Needs to run unattended. After reviewing many solutions on several different sites, and trying two of them, using MS Office Excel for C#, like above, and running into issues with different versions of MS Office, and older versions possibly on the PC, and having no control over that. I ended up going with... Aspose.Cells through NuGet. Solution is four lines of code.

string soureFilePath = "my/source/path/file.xlsx";
string targetFilePath = "my/output/path/file.txt"; 
var book = new Workbook(soureFilePath);
book.Save(targetFilePath, SaveFormat.Tsv);

It only converts sheet 1, and ignored sheets 2 and 3, but that is Ok for my use. I'm guessing it does have functionality to convert all of them if needed, I just did not need that, so did not look into it.

Their web site, incase folks want to view their information or license agreement (no cost use).

Manipulate Spreadsheets Easily | Aspose.Cells Universal Library https://products.aspose.com/cells

Note: I do not work for Aspose, I am not affiliated with Aspose, and I am not profiting in any way from this posting.

Ihor Konovalenko
  • 1,298
  • 2
  • 16
  • 21
DigDug
  • 25
  • 8
0

I maintain some libraries that make the Excel to CSV conversion about as simple as it can be: Sylvan.Data.Excel and Sylvan.Data.Csv. Sylvan.Data.Excel can be used to read .xlsx, .xlsb, and .xls files. However, it can only write .xlsx files.

Here is a minimal example of converting an Excel spreadsheet to CSV:

using Sylvan.Data.Csv;
using Sylvan.Data.Excel;

using var reader = ExcelDataReader.Create("MyData.xlsx");
using var csvWriter = CsvDataWriter.Create("MyData.csv");
csvWriter.Write(reader);

These libraries have no external dependencies, other than the .NET runtime libraries; they don't require Excel be installed. They target the latest supported .NET versions and can run cross-platform. They are also the fastest libraries in the .NET ecosystem. They are MIT licensed, so can be freely used.

MarkPflug
  • 28,292
  • 8
  • 46
  • 54