Reply To: Using SAS with NAACCR XML

#6985
Isaac Hands
Moderator

Following up on my last post to this thread…
I wonder if a CSV file would be a good intermediary between SAS and XML. CSV doesn’t suffer from many of the limitations of the fixed-width file, such as needing to know the position and length of every variable beforehand, so translating between XML and CSV would not require maintaining Volume II metadata for every NAACCR Item. CSV is still limited for conveying multi-tier data, such as Patient/Tumor/etc., but SAS does not understand multi-tier data models anyway, so maybe that’s OK for this use case.
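To make the contrast concrete, here is a minimal Java sketch. It is not part of the converter discussed in this thread. The item names and the fixed-width offsets are made up for illustration and are not real Volume II positions, and the CSV side uses a naive split that does not handle quoted fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LayoutComparison {

    // Fixed-width: each field needs a start offset and a length that must be
    // supplied from outside the file (hypothetical layout, not Volume II).
    static Map<String, String> parseFixedWidth(String record, Map<String, int[]> layout) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> entry : layout.entrySet()) {
            int start = entry.getValue()[0];
            int length = entry.getValue()[1];
            fields.put(entry.getKey(), record.substring(start, start + length).trim());
        }
        return fields;
    }

    // CSV: the header row names the fields, so no external layout is needed.
    // Naive split only; a real reader must also handle quoted values.
    static Map<String, String> parseCsv(String header, String line) {
        String[] names = header.split(",", -1);
        String[] values = line.split(",", -1);
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i], i < values.length ? values[i] : "");
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, int[]> layout = new LinkedHashMap<>();
        layout.put("sex", new int[] {0, 1});          // hypothetical offset and length
        layout.put("primarySite", new int[] {1, 4});  // hypothetical offset and length
        System.out.println(parseFixedWidth("2C509", layout));
        System.out.println(parseCsv("sex,primarySite", "2,C509"));
    }
}
```

Both calls recover the same two fields, but only the fixed-width version depends on a layout table that has to be kept in sync with the file format.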
I have been playing around with some Java code running inside SAS that can generate CSV from NAACCR XML and then load the data as a SAS dataset. So far it looks promising: it takes about 4.5 minutes to load a 6GB NAACCR XML file into a SAS dataset with this method on a pretty basic Windows 10 desktop computer. I’m not sure if that will be acceptable, but it might make a nice proof of concept.
Here is what the SAS code looks like:

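/* Filerefs for the source NAACCR XML file and the intermediate CSV file */
filename xmlfile 'C:\Users\isaac\Documents\ky9515v16.xml';
filename csvfile 'C:\Users\isaac\Documents\ky9515v16.csv';

/* Run the Java converter from a DATA step (the class must be on the SAS Java classpath) */
data _null_;
	length xmlpath csvpath $ 260;
	/* Resolve the filerefs to physical paths; this assumes the constructor takes two path strings */
	xmlpath = pathname('xmlfile');
	csvpath = pathname('csvfile');
	declare JavaObj j1 ("edu/uky/kcr/naaccrxml/csv/ConvertXmlToCsv", xmlpath, csvpath);
	j1.callVoidMethod ("convert");
	j1.delete();
run;

/* Load the generated CSV as a SAS dataset, taking variable names from the header row */
proc import datafile=csvfile
	out=fromcsv
	dbms=csv;
	getnames=yes;
run;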

The Java code behind this uses the Java NAACCR XML library from IMS.
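The converter class itself is not shown in this post. Purely as an illustration of the idea, and not the actual implementation, a class with the same constructor-plus-convert() shape could be sketched with the JDK's built-in StAX parser instead of the IMS library. The class name XmlToCsvSketch, the two-pass approach, and the one-row-per-Tumor flattening are all assumptions. The sketch relies only on the NaaccrData/Patient/Tumor/Item element names and the naaccrId attribute, and it does no validation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical stand-in for a converter class; not the code used in this post.
public class XmlToCsvSketch {
    private final String xmlPath;
    private final String csvPath;

    // Two String arguments, the shape a SAS JavaObj constructor call can pass.
    public XmlToCsvSketch(String xmlPath, String csvPath) {
        this.xmlPath = xmlPath.trim();
        this.csvPath = csvPath.trim();
    }

    // Pass 1 collects item IDs for the header; pass 2 writes one row per Tumor.
    public void convert() throws IOException, XMLStreamException {
        Set<String> columns = new LinkedHashSet<>();
        scan(columns, null);
        try (BufferedWriter out = Files.newBufferedWriter(Paths.get(csvPath), StandardCharsets.UTF_8)) {
            out.write(String.join(",", columns));
            out.newLine();
            scan(columns, out);
        }
    }

    // Streams the file once; when out is null it only records column names.
    private void scan(Set<String> columns, BufferedWriter out) throws IOException, XMLStreamException {
        Map<String, String> root = new HashMap<>();
        Map<String, String> patient = new HashMap<>();
        Map<String, String> tumor = new HashMap<>();
        Map<String, String> current = root;
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try (InputStream in = Files.newInputStream(Paths.get(xmlPath))) {
            XMLStreamReader reader = factory.createXMLStreamReader(in);
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("Patient")) {
                        patient.clear();
                        current = patient;
                    } else if (name.equals("Tumor")) {
                        tumor.clear();
                        current = tumor;
                    } else if (name.equals("Item")) {
                        String id = reader.getAttributeValue(null, "naaccrId");
                        if (id == null) continue;
                        if (out == null) columns.add(id);
                        else current.put(id, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("Tumor")) {
                        if (out != null) writeRow(out, columns, root, patient, tumor);
                        current = patient;
                    } else if (name.equals("Patient")) {
                        current = root;
                    }
                }
            }
            reader.close();
        }
    }

    // Repeats root-level and patient-level items on every tumor row.
    private static void writeRow(BufferedWriter out, Set<String> columns, Map<String, String> root,
                                 Map<String, String> patient, Map<String, String> tumor) throws IOException {
        Map<String, String> row = new HashMap<>(root);
        row.putAll(patient);
        row.putAll(tumor);
        List<String> cells = new ArrayList<>();
        for (String column : columns) {
            cells.add(quote(row.getOrDefault(column, "")));
        }
        out.write(String.join(",", cells));
        out.newLine();
    }

    // Quotes a value only when it contains a comma, a quote, or a line break.
    private static String quote(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }
}
```

Reading the file twice keeps memory use flat for large files, at the cost of extra I/O. Note that in this sketch a Patient with no Tumor produces no row, which is one of the places where the flat CSV loses information from the multi-tier model.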

Copyright © 2018 NAACCR, Inc. All Rights Reserved