Batch (or other?) modification of a CSV column
brucine
Posted messages
24489
Registration date
Status
Membre
Last intervention
-
brucine Posted messages 24489 Registration date Status Membre Last intervention -
brucine Posted messages 24489 Registration date Status Membre Last intervention -
Hello,
I was given a complicated baby to take care of: a CSV file is extracted from a database (of which I know nothing) via a VBS script.
Sidebar question, the CSV file outputs accented characters (I don't know if they are correct in the database) in a hieroglyphic way; is there a simple way in VBS, which I don't understand, like in Batch, to remedy this?
The date is presented, column 2, in the format
YYYY-MM-DD HH:MM.SSS
although the VBS date function uses a different format, but probably just retrieves the system date in long format:
Aside from the accents, the goal of the maneuver is to obtain a date format
YYYYMMDDHHMM
On a sample file where I take only 2 columns, no problem, except that it would be simpler to redirect the modified column 2 to an external file and then replace it in the original file, but I don't know how to do it:
Where it gets tricky is that the file to be processed contains not only a large number of lines including the first line of titles that is not to be processed (which doesn't matter) but also 70 columns tabulated by pipe ""|".
We can always consider it, it's not very elegant, to write 70 variables Row(n), but the catch is that Tokens only accept 31 and the FOR condition 26 variables.
We read here and there that there are tricks both to increase the number of Tokens, for example:
https://stackoverflow.com/questions/980791/number-of-tokens-limit-in-a-for-command-in-a-windows-batch-script
or variables:
https://stackoverflow.com/questions/22867421/how-to-assign-more-than-26-variables-with-for-loop?noredirect=1&lq=1
by nesting FOR commands, but the implementation is not clear in my mind.
Thank you.
I was given a complicated baby to take care of: a CSV file is extracted from a database (of which I know nothing) via a VBS script.
Sidebar question, the CSV file outputs accented characters (I don't know if they are correct in the database) in a hieroglyphic way; is there a simple way in VBS, which I don't understand, like in Batch, to remedy this?
The date is presented, column 2, in the format
YYYY-MM-DD HH:MM.SSS
although the VBS date function uses a different format, but probably just retrieves the system date in long format:
dateNow = now()
now = sprintf("{0:yyyyMMddHHmmss}", Array(dateNow))
Aside from the accents, the goal of the maneuver is to obtain a date format
YYYYMMDDHHMM
On a sample file where I take only 2 columns, no problem, except that it would be simpler to redirect the modified column 2 to an external file and then replace it in the original file, but I don't know how to do it:
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
(
FOR /f "tokens=1-2 usebackqdelims=|" %%a IN ("input.csv") DO (
SET "Row2=%%b"
SET "Row2=!Row2:-=!"
SET "Row2=!Row2: =!"
SET "Row2=!Row2::=!"
SET "Row2=!Row2:.=!"
SET "Row2=!Row2:~0,12!
SET "Row1=%%a"
ECHO "%%~a"^|!Row2!>>output.csv
)
)
GOTO :EOF
Where it gets tricky is that the file to be processed contains not only a large number of lines including the first line of titles that is not to be processed (which doesn't matter) but also 70 columns tabulated by pipe ""|".
We can always consider it, it's not very elegant, to write 70 variables Row(n), but the catch is that Tokens only accept 31 and the FOR condition 26 variables.
We read here and there that there are tricks both to increase the number of Tokens, for example:
https://stackoverflow.com/questions/980791/number-of-tokens-limit-in-a-for-command-in-a-windows-batch-script
or variables:
https://stackoverflow.com/questions/22867421/how-to-assign-more-than-26-variables-with-for-loop?noredirect=1&lq=1
by nesting FOR commands, but the implementation is not clear in my mind.
Thank you.
2 réponses
yg_be
Posted messages
23437
Registration date
Status
Contributeur
Last intervention
Ambassadeur
1 588
Hello,
I didn't understand everything, but in VBS, you could use split(), which creates an array from a string based on a separator passed as a parameter.
VBS is much more "rich" than a batch file, it allows you to do everything you can do in VB or VBA, except for VBA, the integration with the host Microsoft application (Excel, ...).
I didn't understand everything, but in VBS, you could use split(), which creates an array from a string based on a separator passed as a parameter.
VBS is much more "rich" than a batch file, it allows you to do everything you can do in VB or VBA, except for VBA, the integration with the host Microsoft application (Excel, ...).
Good evening,
Absolutely, and probably also through other "elegant" means (Python, Linux utilities like sed, but which were excluded anyway because I'm not the user and have to work with what I have...).
But apart from the fact that, as I said, I don't understand anything about VBS, I don't have access to the data file, so I don’t have much chance of understanding what needs to be extracted and in what format, nor can I test anything since I also don’t know the original date format in the database.
Absolutely, and probably also through other "elegant" means (Python, Linux utilities like sed, but which were excluded anyway because I'm not the user and have to work with what I have...).
But apart from the fact that, as I said, I don't understand anything about VBS, I don't have access to the data file, so I don’t have much chance of understanding what needs to be extracted and in what format, nor can I test anything since I also don’t know the original date format in the database.
The VBS extracts data from a database, apparently concerning audio transcriptions, to a CSV file.
I do not have the database, and I cannot upload the VBS or an excerpt of the CSV, as they are confidential.
However, the only format declaration in the VBS is related to the date (and poorly configured since it does not output the desired format, but it is useless for me to try to change it since I do not have the database to test).
The accented characters are either already present in the database or, similar to what happens in Batch, extracted in the wrong character table; there is only one specific case, regarding data that may have multiple origins, where there is a split syntax for potentially accented names, never for the address, and without apparent format:
strNomClient = Split(strLine, separateurFichierExport)(14)
I do not have the database, and I cannot upload the VBS or an excerpt of the CSV, as they are confidential.
However, the only format declaration in the VBS is related to the date (and poorly configured since it does not output the desired format, but it is useless for me to try to change it since I do not have the database to test).
The accented characters are either already present in the database or, similar to what happens in Batch, extracted in the wrong character table; there is only one specific case, regarding data that may have multiple origins, where there is a split syntax for potentially accented names, never for the address, and without apparent format:
strNomClient = Split(strLine, separateurFichierExport)(14)
That's where I don't get it, my head is all muddled today: if I redirect the first 2 columns like I did in an output file and add %%c, it's only going to give me the 3rd column, not the remaining 68?
@ECHO OFF SETLOCAL ENABLEDELAYEDEXPANSION ( FOR /f "tokens=1,2,* usebackq delims=|" %%a IN ("input.csv") DO ( SET "Row2=%%b" SET "Row2=!Row2:-=!" SET "Row2=!Row2: =!" SET "Row2=!Row2::=!" SET "Row2=!Row2:.=!" SET "Row2=!Row2:~0,12! SET "Row1=%%a" ECHO "%%~a"^|!Row2!^|%%c>>output.csv ) ) GOTO :EOFThank you.
If you have a lot of variables to handle, you could also have used a function that takes as many parameters as you have variables, then you pass from one parameter to the next using the shift command.