0

When I am importing data as text from excel, it is importing as scientific notation to SQL, even when I can see at all steps during the process the data appears as text without scientific notation. Here is a snippet of the original file:

Excel Data

Utilizing the ExcelDataReader 3rd party code (https://github.com/ExcelDataReader/ExcelDataReader), I am able to extract a DataTable from the excel file and read it. When reading the data, the column and values show up as I suspect from a scratch console app.

Sample Output

When I run the code responsible for sending to SQL, I can see in the SQL profiler the SQL it is executing. I can copy it and execute it separately, which then imports the data as I would expect. Here is a sample.

declare @p1 dbo.ImportDataTableTest
insert into @p1 values(1623157535)
insert into @p1 values(1690205652)
insert into @p1 values(1690205654)
insert into @p1 values(1623110539)
insert into @p1 values(1623165689)
insert into @p1 values(1623165689)
insert into @p1 values(1623171209)
insert into @p1 values(1623128536)
insert into @p1 values(1623172096)

exec InsertStagingData @ImportDataTable=@p1

My problem shows up here after running the application as normal. This is the result in the table:

InvNo           StagingID
3.22029e+009    6491
3.22028e+009    6492
3.22028e+009    6493
3.22028e+009    6494
3.22029e+009    6495

Here is the scratch app and sql code I am using for this example:

C#

namespace ConsoleApp1
{

    class Program
    {

            static void Main(string[] args)
            {
                string fileName = @"path to the file";

                ExcelDataWrapper wrapper = new ExcelDataWrapper();

                ExcelFile file = wrapper.GetExcelData(fileName);

                DataTable dt = file.ExcelData.Tables[0];

                using (var conn = new SqlConnection("our connection string"))
                using (var command = new SqlCommand("InsertStagingData", conn)
                {
                    CommandType = CommandType.StoredProcedure
                })
                {
                    conn.Open();
                    //This parameter name would be passed in from our steps table and would allow 
                    //for different values to be imported per database
                    command.Parameters.Add(new SqlParameter("@ImportDataTable", dt));
                    command.ExecuteNonQuery();
                }         
            }
    }

}

namespace ExcelDataReaderWrapper
{
    public class ExcelDataWrapper
    {
        public ExcelFile GetExcelData(string pathToFile)
        {

            DataSet ds;

            ds = Read(pathToFile);

            string fileName = Path.GetFileName(pathToFile);

            ExcelFile dataTableWithFileName = new ExcelFile { FileName = fileName, ExcelData = ds };

            return dataTableWithFileName;

        }

        private DataSet Read(string file)
        {
            System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
            using (var stream = File.Open(file, FileMode.Open, FileAccess.Read))
            {
                if (file.EndsWith(".xls"))
                {
                    using (IExcelDataReader reader = ExcelReaderFactory.CreateBinaryReader(stream))
                    {
                        var conf = new ExcelDataSetConfiguration
                        {
                            ConfigureDataTable = _ => new ExcelDataTableConfiguration
                            {
                                UseHeaderRow = true
                            }
                        };

                        return reader.AsDataSet(conf);
                    }
                }
                if (file.EndsWith(".xlsx"))
                {
                    using (IExcelDataReader reader = ExcelReaderFactory.CreateOpenXmlReader(stream))
                    {
                        var conf = new ExcelDataSetConfiguration
                        {
                            ConfigureDataTable = _ => new ExcelDataTableConfiguration
                            {
                                UseHeaderRow = true
                            }
                        };

                        return reader.AsDataSet(conf);
                    }
                }
                return null;
            }
        }   
    }
}

SQL:

/****** Object:  UserDefinedTableType [dbo].[ImportDataTableTest]    Script Date: 11/19/2019 12:18:18 PM ******/
CREATE TYPE [dbo].[ImportDataTableTest] AS TABLE(
    [InvNo] [nvarchar](255) NULL
)
GO

CREATE Procedure [dbo].[InsertStagingData]
    @ImportDataTable ImportDataTableTest readonly
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    Insert into StagingPOS([InvNo])
    Select InvNo
    from  @ImportDataTable


END
GO

Is there some solution I am missing that gets the data as I see it from the application and the profiler to the table correctly as a nvarchar field? I tried casting the values, but the import looses some of the precision that I need.

EDIT 1:

I followed the link provided by @pcalkins. This would appear to solve the problem as you can see here:

    ALTER Procedure [dbo].[InsertStagingData]
    @ImportDataTable ImportDataTableTest readonly
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    Insert into StagingPOS([InvNo])
    select  Case When InvNo like '%e+%' or InvNo like '%e-%' then replace(cast(cast(cast([InvNo] as real) as money) as nvarchar(255)),'.00','')
    Else InvNo
    End
    from  @ImportDataTable


END

InvNo       StagingID
3220290048  8435
3220280064  8436
3220280064  8437
3220280064  8438

But when I query the values in my excel file to the results in the table, nothing comes up. The values appear rounded at different points in the number.

zep426
  • 179
  • 9
  • I recommend simplifying your question. Its very long and detailed as it stands which is going to put people off reading and replying. Its recommended to have a "minimal", "reproducible" example i.e. the smallest amount of information required to understand the problem. All the background information in fact only making its more confusing for readers. – Dale K Nov 19 '19 at 19:41
  • I believe the value is saved as the scientific notation in Excel if it's a certain number of places. Not sure why it isn't casting it correctly, though... it should. I think many of the "views" here are showing the number...(they're converting it themselves which is confusing) Maybe try this: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/9753eadd-5677-4ed9-8942-b3528785253b/convert-from-scientific-notation?forum=transactsql – pcalkins Nov 19 '19 at 20:06
  • @DaleK your right, edited above for some clarity. – zep426 Nov 19 '19 at 20:07
  • looks like it's any value past 11 characters...(if set to non-string type in Excel) – pcalkins Nov 19 '19 at 20:23
  • Looks like I was wrong... the number is saved as the number because the scientific notation is a shortened version of the number. It's definitely short-hand notation and not the actual value... it just seems to trucate and add zeroes for places. (?) – pcalkins Nov 19 '19 at 20:33
  • Further testing and I see data loss occuring in Excel's conversions, so maybe it's true that it saves only the truncated value when the column's data type is set a certain way. Very strange. Sometimes you can see both in Excel, (scientifc and full number when double-clicking on it), but if you save it and re-open, you'll see the value itself is losing precision and being padded with zeroes! – pcalkins Nov 19 '19 at 20:44
  • so in short, this is garbage in, garbage out... a clue will be that ExcelDataReader imports scientific notation in the first place. That means that the excel file itself is borked. – pcalkins Nov 19 '19 at 20:51
  • @pcalkins yeah, I can see collapsing the column it turns into scientific notation all of a sudden. But what confuses me is both when I output the data to console, and grap it from the profiler, it appears as it should. – zep426 Nov 19 '19 at 20:57
  • that is odd... are you sure you're using the same file each time? Also when you debug after getting the datatable, what do you see as the values in "dt"? Are they correct there? – pcalkins Nov 19 '19 at 21:18
  • @pcalkins I have my test code point to the same file each time I run, so I know were good there. Additionally, I can go into the debugger and see in the DataTable Visualizer that the data is correct here too. – zep426 Nov 19 '19 at 21:38
  • For any value over 14 characters... excel seems like it's losing information, but it starts displaying the notation at 11 or so... and those are still imported and saved correctly (at least by NPOI), but the ones above 14 are showing up as scientific notation and showing in Excel as truncated with zeroes where there used to be numbers. Your numbers are smaller... but maybe something's going on where an int to long or long to int is causing loss? Int Max is "2,147,483,647" ... some of your numbers are above that. – pcalkins Nov 19 '19 at 22:00
  • I found something about the Excel quirk here: http://excelunplugged.com/2014/05/19/15-digit-limit-in-excel/ Don't think that explains your problem though, since the numbers are showing correctly in the datatable and you're not at 15 places. – pcalkins Nov 19 '19 at 22:11
  • it's got to be some kind of conversion of types... maybe try the solution here to force the datatable to string: https://stackoverflow.com/questions/57288816/how-to-convert-iexceldatareader-values-to-string-datatype – pcalkins Nov 19 '19 at 22:31
  • There are many established ways to read simple tabular data from Excel into SQL. What are your reasons for re-inventing the wheel with this 3rd party library? – Alex Nov 21 '19 at 01:13

0 Answers0