Hi,
I have a large data set (a 300+ MB CSV file) and I'm trying to cut down refresh times.
What I'm trying to do is split one of the columns on a delimiter, then duplicate and
recombine all of the delimited columns while keeping the other associated columns.
E.G.
This would be the raw data:
Value Text
100 this is an example
300 this is one more example
After splitting only the Text column I would get the following (roughly the step sketched in M below the table):
Value  Text  Text1  Text2  Text3    Text4
100    this  is     an     example  null
300    this  is     one    more     example
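The split step itself is roughly this in the Advanced Editor (the file path, the space delimiter and the column names are simplified for the example, they aren't the real ones):

let
    // simplified path and CSV options, not the real file
    Source = Csv.Document(File.Contents("C:\data\example.csv"), [Delimiter=",", Encoding=65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    // split the Text column on spaces into a fixed set of new columns;
    // more names would be needed for values with more words (up to 15 in my data)
    Split = Table.SplitColumn(
        Promoted,
        "Text",
        Splitter.SplitTextByDelimiter(" ", QuoteStyle.None),
        {"Text", "Text1", "Text2", "Text3", "Text4"}
    )
in
    Split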
Then I want to duplicate this data so I can get total values for each word used, like so:
Value Text
100 this
100 is
100 an
100 example
300 this
300 is
300 one
300 more
300 example
I've managed to do this by creating multiple references to the original data source, deleting the corresponding columns in each reference, and then recombining them. But with the file being so large, and some text values having up to 15 words, it takes a LONG time to refresh each individual reference. As I only have one actual data source, I presumed the file would only have to be refreshed once, but it refreshes for every one of the references, which is what is taking so long.
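(The recombining step is basically just an append of those reference queries, something like this, with simplified names:)

let
    // Text1Words, Text2Words, ... are the separate reference queries,
    // each keeping Value plus one of the split word columns renamed back to Text
    // (names simplified for the example)
    Combined = Table.Combine({Text1Words, Text2Words, Text3Words, Text4Words})
in
    Combined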
Can anyone help me speed up this refresh, please?
Maybe an alternative way of doing this would work better. For example, would a single query that splits Text into a list of words and expands that list to rows (sketched below, with the same simplified path and column names) avoid reading the source once per reference?
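let
    // same simplified path and column names as above
    Source = Csv.Document(File.Contents("C:\data\example.csv"), [Delimiter=",", Encoding=65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    // turn each Text value into a list of its words
    AddWords = Table.AddColumn(Promoted, "Word", each Text.Split([Text], " "), type list),
    // expand the list so every word becomes its own row, keeping its Value
    Expanded = Table.ExpandListColumn(AddWords, "Word"),
    // drop the original Text column and rename Word to Text to match the layout above
    Removed = Table.RemoveColumns(Expanded, {"Text"}),
    Result = Table.RenameColumns(Removed, {{"Word", "Text"}})
in
    Result

Any help is much appreciated!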