1

How do you create a table where each column is left-padded with zeros when fewer than the maximum number of characters are entered during data entry?
I know how to pad after the data has been entered into the table, but I don't know how to pad at the point of data entry.

Example:

Import values for a column defined as nvarchar(15):

RTS092
3ISUEX
ABCDEFGHIJ
9485028

Stored values in the table after data entry/import:

000000000RTS092
0000000003ISUEX
00000ABCDEFGHIJ
000000009485028
bummi
I_AM_JARROD
  • Various helpers have suggested ways to prepend zeroes while extracting data from the table. If a column has a unique constraint then values `FOO` and `0FOO` should both be treated as equal, i.e. they are both representations of `000000000000FOO`. Does this apply to your question? – HABO Jan 05 '15 at 23:32

3 Answers

2

How do I Left Pad a value?

I follow the pattern of prepending a maximum-length pad string to the target column and then keeping only the rightmost N characters. In this case, '000000000000000' + 'RTS092' yields 000000000000000RTS092, which is clearly too long, so I apply a RIGHT substring operation to retain the rightmost 15 characters.

I find this to be a "dumb" approach but it works well. There's very little logic to get wrong. You might be able to shave some microscopic time units by padding only when needed and only by the minimal amount, but the cost of getting that logic right and keeping it right generally outweighs the gain compared with the simplicity of this approach.
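
As a minimal T-SQL sketch of that pattern, the query below pads the sample values from the question to 15 characters. The variable name @PadLength and the derived table alias D are assumptions made for illustration; the trailing RIGHT is what trims the over-long concatenation back down.

```sql
-- Assumed for illustration: a pad length of 15 and a pad character of '0'
DECLARE @PadLength int = 15;

SELECT
    D.MyCol
    -- Prepend a full-width string of zeros, then keep only the rightmost 15 characters
,   RIGHT(REPLICATE(N'0', @PadLength) + D.MyCol, @PadLength) AS PaddedCol
FROM
(
    VALUES
    (N'RTS092')
,   (N'3ISUEX')
,   (N'ABCDEFGHIJ')
,   (N'9485028')
)D(MyCol);
```

PaddedCol should match the stored values listed in the question, for example 000000000RTS092 for RTS092.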

Applying the pad

As I see it, there are 3 approaches here: transform it in flight, fix it before it lands, translate it on access.

Transform

The most sane approach is to fix the data before it lands in the table. That's the whole T of ETL. Within SSIS, I would add a Derived Column Component into my Data Flow using the following logic

ISNULL(MyCol) ? 
    REPLICATE(@[User::PadCharacter],@[User::PadLength]) :
    RIGHT((REPLICATE(@[User::PadCharacter],@[User::PadLength]) + LTRIM(RTRIM(MyCol))),@[User::PadLength])

This is a generic approach and assumes I've created 2 SSIS level Variables: PadCharacter and PadLength. I specify a PadCharacter of 0 and a PadLength of 15. If you discover you need to use a pad character of X and length of 20, you change your value, not your formula.

It correctly handles NULL values as well as white space padding on either side.
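
For readers without SSIS at hand, roughly the same logic can be sketched in T-SQL. This is an approximation of the expression, not the SSIS expression itself; the variables @PadCharacter and @PadLength stand in for the SSIS variables of the same name.

```sql
-- T-SQL stand-ins for the two SSIS variables
DECLARE @PadCharacter nchar(1) = N'0';
DECLARE @PadLength int = 15;

SELECT
    D.MyCol
,   CASE
        -- NULL input becomes a value made entirely of pad characters
        WHEN D.MyCol IS NULL THEN REPLICATE(@PadCharacter, @PadLength)
        -- Otherwise trim surrounding white space, pad, and keep the rightmost characters
        ELSE RIGHT(REPLICATE(@PadCharacter, @PadLength) + LTRIM(RTRIM(D.MyCol)), @PadLength)
    END AS der_MyCol
FROM
(
    VALUES
    (N' UGH ')
,   (NULL)
)D(MyCol);
```

With these inputs, ' UGH ' should come back as 000000000000UGH and the NULL row as fifteen zeros.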


Fix it before it lands

As Greg has called out, you can use an INSTEAD OF trigger to fix the data prior to it touching the table. I would only advocate this approach if I had applications I could not fix to prevent bad data from hitting the table. I'd also be benchmarking the increased insert cost and ensure my business users understand the potential impact on the system.

Translate it on access

Assuming you can't fix it in flight and the cost of a trigger is too great, you can look at creating a computed column on the table. It doesn't matter that we stored the value as 'RTS092'; the column that applications access is defined using the same formula as above, so users are always presented with a padded value.

Code approximately

ALTER TABLE
    dbo.MyTable
ADD
    MyComputedCol AS RIGHT(( REPLICATE(N'0',15) + COALESCE(LTRIM(RTRIM(MyCol)), N'') ), 15);
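
To see the computed column in action, here is a small sketch against a temporary table. The table name #MyTable is hypothetical and exists only for this demonstration; the stored value in MyCol is left untouched while MyComputedCol presents the padded form.

```sql
-- Hypothetical temp table used only to demonstrate the computed column
CREATE TABLE #MyTable
(
    MyCol nvarchar(15) NULL
    -- Trim, treat NULL as an empty string, pad, and keep the rightmost 15 characters
,   MyComputedCol AS RIGHT(REPLICATE(N'0', 15) + COALESCE(LTRIM(RTRIM(MyCol)), N''), 15)
);

INSERT INTO #MyTable (MyCol)
VALUES (N'RTS092'), (N' UGH '), (NULL);

-- MyCol shows what was stored; MyComputedCol shows what applications would read
SELECT MyCol, MyComputedCol FROM #MyTable;

DROP TABLE #MyTable;
```

Note that in this formula a NULL source value is presented as fifteen zeros, the same as an empty string.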

Biml

Since I'm a fan of showing my work, the following Biml will create the above package. This does use BimlScript because the string operations can make it fiddly for the emitter to determine the length, so I explicitly force the column to a DT_WSTR data type with a specified length. Download BIDS Helper, add a .biml file, and paste in the following code, but watch out for Visual Studio "helping" you when you paste.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <# int padLength=15;#>
    <Connections>
        <OleDbConnection Name="CM_OLE" ConnectionString="Data Source=localhost\dev2014;Initial Catalog=tempdb;Provider=SQLNCLI10.1;Integrated Security=SSPI;Auto Translate=False;" />
    </Connections>
    <Packages>
        <Package ConstraintMode="Linear" Name="so_27787812">
            <Variables>
                <Variable DataType="String" Name="QuerySource">
                    <![CDATA[SELECT
    D.MyCol
FROM
(
    VALUES
    (N'RTS092')
,   (N'3ISUEX')
,   (N'ABCDEFGHIJ')
,   (N'9485028')
,   (N' UGH ')
,   (NULL)
)D(MyCol);]]>
                </Variable>
                <Variable DataType="String" Name="PadCharacter">0</Variable>
                <Variable DataType="Int32" Name="PadLength"><#=padLength#></Variable>
            </Variables>
            <Tasks>
                <Dataflow Name="DFT LeftPad">
                    <Transformations>
                        <OleDbSource ConnectionName="CM_OLE" Name="OLE_SRC Query">
                            <VariableInput VariableName="User.QuerySource"></VariableInput>
                        </OleDbSource>
                        <!--
                            Left pad our column with our specified character.
                            Assumes source column is called MyCol
                            Creates a new column called der_MyCol
                        -->
                        <DerivedColumns Name="DFT LeftPad">
                            <Columns>
                                <Column DataType="String" Name="der_MyCol" Length="<#=padLength#>">ISNULL([MyCol]) ? REPLICATE(@[User::PadCharacter], @[User::PadLength]) : RIGHT((REPLICATE(@[User::PadCharacter], @[User::PadLength]) + LTRIM(RTRIM([MyCol]))), @[User::PadLength])</Column>
                            </Columns>
                        </DerivedColumns>
                        <DerivedColumns Name="DFT Do nothing" />

                    </Transformations>
                </Dataflow>
            </Tasks>
        </Package>
    </Packages>
</Biml>
billinkc
0
right('000000000000000'+ rtrim(@str), 15)

where @str is the string. To put it in terms of a query:

SELECT COL, right(('000000000000000'+ rtrim(COL)), 15) PaddedString FROM YourTable

Source: Most efficient T-SQL way to pad a varchar on the left to a certain length?

If you want the padding to happen before the data enters the table, you could load the raw data into a staging table and then process it and move it to a second table. Alternatively, apply the above expression at the source so that the padded value is built before it reaches the target table. In the query above, YourTable would be the source table.
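
A rough sketch of the staging-table route follows. The temporary table names #Staging and #Target are hypothetical stand-ins for a real staging table and target table; the padding is applied in the SELECT that moves rows across.

```sql
-- Hypothetical staging and target tables for illustration
CREATE TABLE #Staging (MyCol nvarchar(15) NULL);
CREATE TABLE #Target (MyCol nvarchar(15) NULL);

-- Raw, unpadded values land in staging first
INSERT INTO #Staging (MyCol)
VALUES (N'RTS092'), (N'9485028');

-- Pad while moving rows from staging to the target
INSERT INTO #Target (MyCol)
SELECT RIGHT(REPLICATE(N'0', 15) + RTRIM(S.MyCol), 15)
FROM #Staging AS S;

SELECT MyCol FROM #Target;

DROP TABLE #Staging;
DROP TABLE #Target;
```

The target table only ever sees padded values, so nothing downstream needs to know the data arrived unpadded.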

Community
SouravA
0

You probably shouldn't be doing this in your database; the application submitting the data is a better place for it. If you have to do it in the database, I suggest an INSTEAD OF INSERT trigger, something like the following:

CREATE TRIGGER myTriggerINSERT
ON myTableName
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Set-based insert so that multi-row INSERT statements are padded too
    INSERT INTO myTableName (myField)
    SELECT RIGHT('000000000000000' + RTRIM(i.myField), 15)
    FROM inserted AS i;
END
Greg the Incredulous
  • 2
    Your trigger has two major flaws. First, it will not support multiple row operations. Second is that an insert will now have two rows instead of one because you didn't specify this as an INSTEAD of trigger. That being said, I don't think a trigger is needed here because they indicate this is for a data import. – Sean Lange Jan 05 '15 at 21:23
  • Nice approach, but if most of the records are already 15 characters in length, it would be a poor performer. You should probably have an `IF` to check the length first. – SouravA Jan 05 '15 at 21:23
  • Well spotted @SeanLange! :) – SouravA Jan 05 '15 at 21:28
  • @SeanLange I missed the point that it was for a data import. You're right about the INSTEAD of trigger - I'll edit the answer. I would avoid doing this in the DB in any case - SSIS would be a better place if it's a data import. – Greg the Incredulous Jan 05 '15 at 21:44