Since SQLcl is a pretty cool tool for everyone having to deal with Oracle databases, I thought it was a good idea to start using it for a couple of things, for example exporting data from one database to a CSV and import it in a different database via the SQLcl LOAD command.

It all works pretty nicely when you are working in a database and OS with english or american locale settings. However, that’s not the case for me, since I’m from Germany, and so I ran into a couple of problems that I want to investigate (and provide a workaround) in this blog post.

(This is all done with SQLcl 21.2, but I expect the behaviour to change in future versions)

Preparations

I created a complete script that we can use to investigate what happens when exporting and loading data.

The script is divided into some preparation and cleanup stuff and then 2 Steps: Export to CSV and Import using LOAD.

/* Prepare test-tables */
set feedback off
create table force_measures (
    id integer generated by default on null as identity primary key not null,
    character varchar2(256),
    force_power number(32,2),
    measure_date date
);

create table force_measures2 (
    id integer generated by default on null as identity primary key not null,
    character varchar2(256),
    force_power number(32,2),
    measure_date date
);

insert into force_measures ( character, force_power, measure_date ) 
  values ( 'Han Solo', 12.5, date '2021-05-29' );
insert into force_measures ( character, force_power, measure_date ) 
  values ( 'Anakin Skywalker', 100500.33, date '2020-09-05' );
insert into force_measures ( character, force_power, measure_date ) 
  values ( 'Mace Windu', 7899, date '1984-03-23' );

commit;

set feedback on

/* Step 1: Export */
spool sqlcl_load_numeric.csv
select /*csv*/ * from force_measures;
spool off

/* Step 2: Import */
load force_measures2 sqlcl_load_numeric.csv

select * from force_measures2;

/* Cleanup: */
set feedback off
drop table force_measures;
drop table force_measures2;

exit

When I run it like that without any modifications, I will get some ugly chars (Umlaute), so the first thing I want to change is setting my CMD/Powershell and also SQLcl to UTF-8.

For powershell, I can use the chcp command, setting it to codepage 65001 (UTF-8).

For SQLcl, I can use the JVM file.encoding argument and pass it to SQLcl by writing it to the JAVA_TOOL_OPTIONS environment variable:

chcp 65001
$env:java_tool_options="-Dfile.encoding=UTF8"

Problem 1: Decimal-Export with German Numeric Characters

When I run my script without changing anything (being in the German locale), SQLcl LOAD will report something like this:

--Einfügen für Zeile  1  nicht erfolgreich
--MEASURE_DATE: GDK-05040: Eingabewert nicht lang genug für Datumsformat.
--Daten der Zeile 1 folgen:
INSERT INTO FORCE_MEASURES2(ID,CHARACTER,FORCE_POWER,MEASURE_DATE)
VALUES (1.0,'Han Solo',12.0,to_date('5'));

When we look at the created CSV-file, we can immediately see the problem, because due to the German Numeric characters (, as decimal delimiter), we end up not having a valid CSV file:

"ID","CHARACTER","FORCE_POWER","MEASURE_DATE"
1,"Han Solo",12,5,29.05.21
2,"Anakin Skywalker",100500,33,05.09.20
3,"Mace Windu",7899,23.03.84

The number 12,5 will be interpreted as two separate fields, because the SQLcl /*csv*/ output will not surround numbers with ".

We can change that by setting the NLS_NUMERIC_CHARACTERS to .,, so 12,5 will be represented as 12.5:

/* Step 1: Export */
alter session set nls_numeric_characters = '.,';

spool sqlcl_load_numeric.csv
select /*csv*/ * from force_measures;
spool off
"ID","CHARACTER","FORCE_POWER","MEASURE_DATE"
1,"Han Solo",12.5,29.05.21
2,"Anakin Skywalker",100500.33,05.09.20
3,"Mace Windu",7899,23.03.84

Way better! And we don’t get an error anymore when loading the CSV.

Problem 2: Decimal-Import with German locale

But when we have a look at the final content of force_measures, we can see another problem:

   ID           CHARACTER    FORCE_POWER    MEASURE_DATE
_____ ___________________ ______________ _______________
    1 Han Solo                       125 29.05.21
    2 Anakin Skywalker          10050033 05.09.20
    3 Mace Windu                    7899 23.03.84

So, the 12.5 force-power of Han Solo turned to 125 – that’s not correct, he isn’t that strong with the force!

Notice that we already set NLS_NUMERIC_CHARACTERS correctly, but SQLcl doesn’t pick it up, still trying to use the German numeric characters (where . is just the thousands delimiter).

The reason for this could be, that the LOAD command is not part of the database, but an addition of SQLcl, which is written in Java. Java has its own way to deal with locales.

Fortunately, there’s a command in SQLcl to control the locale:

/* Step 2: Import */
set load locale American America
load force_measures2 sqlcl_load_numeric.csv

With that change, I get the correct results:

   ID           CHARACTER    FORCE_POWER    MEASURE_DATE
_____ ___________________ ______________ _______________
    1 Han Solo                      12.5 29.05.21
    2 Anakin Skywalker         100500.33 05.09.20
    3 Mace Windu                    7899 23.03.84

Strangeness 1 – When we set any other LOAD locale

One strange thing is that we can also set the LOAD locale to “German/Germany” – it still works:

/* Step 2: Import */
set load locale German Germany
load force_measures2 sqlcl_load_numeric.csv
   ID           CHARACTER    FORCE_POWER    MEASURE_DATE
_____ ___________________ ______________ _______________
    1 Han Solo                      12.5 29.05.21
    2 Anakin Skywalker         100500.33 05.09.20
    3 Mace Windu                    7899 23.03.84

We can even do something like that – and it still works:

/* Step 2: Import */
set load locale Wookie Kashyyk
load force_measures2 sqlcl_load_numeric.csv

Strange.

Strangeness 2 – When Changing the Java Locale

But the really strange thing about that solution is, that it works at all. Because when we set the locale to American/America, this should also influence the date format and we should in fact get an error.

We can simulate this by changing the Java Locale to en_US:

 $env:java_tool_options="-Dfile.encoding=UTF8 -Duser.language=en -Duser.country=US"

We also need to stop exporting the CSV-file, because otherwise the CSV will be written with American-locale, too. But I previous solution did not change the the overall locale, it did only change the locale for the LOAD command, using the same CSV file with dates written like 29.05.21

/* Step 1: Export */
alter session set nls_numeric_characters = '.,';

--spool sqlcl_load_numeric.csv
--select /*csv*/ * from force_measures;
--spool off

/* Step 2: Import */
set load locale American America
load force_measures2 sqlcl_load_numeric.csv

When we run the script again, we get a date conversion error – as we would expect.

#ERROR Insert failed for row  1
#ERROR MEASURE_DATE: GDK-05043: not a valid month
#ERROR Row 1 data follows:
1,Han Solo,12.5,29.05.21

What we would need to do is to export our data with the exact same locale, so our CSV-data looks like this:

"ID","CHARACTER","FORCE_POWER","MEASURE_DATE"
1,"Han Solo",12.5,29-MAY-21
2,"Anakin Skywalker",100500.33,05-SEP-20
3,"Mace Windu",7899,23-MAR-84

With that input-data, everything works fine.

And we can even get completely rid of both, setting the LOAD locale and altering NLS_NUMERIC_CHARACTERS – because we are now completely in the American/America world.

But what if we HAVE English input, but German locale?

Given what we found out so far, I would recommend to always setting the Java-locale to the same as the input file you want to import and not try to use the LOAD locale and NLS_NUMERIC_CHARACTERS. But sometimes there are circumstances, so let’s have a look what we can do when we have English input but want to import it into a German locale.

Our input looks like this:

"ID","CHARACTER","FORCE_POWER","MEASURE_DATE"
1,"Han Solo",12.5,29-MAY-21
2,"Anakin Skywalker",100500.33,05-SEP-20
3,"Mace Windu",7899,23-MAR-84

As expected, setting the LOAD locale does not help:

/* Step 2: Import */
set load locale American America
load force_measures2 sqlcl_load_numeric.csv
#ERROR Einfügen für Zeile  1  nicht erfolgreich
#ERROR MEASURE_DATE: GDK-05058: Nicht numerisches Zeichen wurde gefunden
#ERROR Daten der Zeile 1 folgen:
1,Han Solo,12.5,29-MAY-21

So let’s alter the NLS_DATE_FORMAT then

/* Step 2: Import */
set load locale American America
alter session set nls_date_format = 'DD-MON-YY';
load force_measures2 sqlcl_load_numeric.csv

Better, but still doesn’t work (because “MAY” is not a known German month)

#ERROR Einfügen für Zeile  1  nicht erfolgreich
#ERROR MEASURE_DATE: GDK-05043: Kein gültiger Monat
#ERROR Daten der Zeile 1 folgen:
1,Han Solo,12.5,29-MAY-21

So we really need to change NLS_LANG, too:

/* Step 2: Import */
set load locale American America
alter session set nls_date_format = 'DD-MON-YY';
alter session set nls_language = 'American';
load force_measures2 sqlcl_load_numeric.csv
   ID           CHARACTER    FORCE_POWER    MEASURE_DATE
_____ ___________________ ______________ _______________
    1 Han Solo                      12,5 29-MAY-21
    2 Anakin Skywalker         100500,33 05-SEP-20
    3 Mace Windu                    7899 23-MAR-84

Final Strangeness

Now there’s a final strangeness:

Since we’re controlling NLS_DATE_FORMAT, NLS_LANGUAGE and NLS_NUMERIC_CHARACTERS – can we get rid of LOAD locale?

/* Step 2: Import */
alter session set nls_numeric_characters = '.,';
alter session set nls_date_format = 'DD-MON-YY';
alter session set nls_language = 'American';
load force_measures2 sqlcl_load_numeric.csv

Nope, we can’t, because now Han Solo is a lot more powerful again:

   ID           CHARACTER    FORCE_POWER    MEASURE_DATE
_____ ___________________ ______________ _______________
    1 Han Solo                       125 29-MAY-21
    2 Anakin Skywalker          10050033 05-SEP-20
    3 Mace Windu                    7899 23-MAR-84

Happy Loading 🙂

TL;DR

Quick solution: Set the Java-locale for both, exporting and importing to en_EN (or en_US) using the JAVA_TOOL_OPTIONS environment table:

$env:java_tool_options="-Dfile.encoding=UTF8 -Duser.language=en -Duser.country=US"

If you don’t want or can’t, it gets a bit complicated.

Categories: SQL

0 Comments

Leave a Reply

Avatar placeholder

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.