Quantcast
Channel: Desktop topics
Viewing all articles
Browse latest Browse all 216047

Poor performance importing text file in Power Query

$
0
0

I'm dealing with a 122MB .txt file.


It contains ~20k rows and 4.3k columns (yes I thought that was absurd too), separated by semicolons.


I did some cleaning to extract the information I needed, with the first step being splitting the column by the semicolon delimeter, and I use the resulting data to merge it with another query. However, loading the data is taking way too long (8-10 minutes).

  

Any ideas on how to improve performance?


Here is my M code (I cut off the text in the first two steps because it lists all the 4.3k columns.

 

 

let
    Source = Table.FromColumns({Lines.FromBinary(File.Contents("G:\exporta.txt"),null,null,1252)}),
    #"Split Column by Delimiter" = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv),{"Column1.1", .....
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, .....
    #"Promoted Headers" = Table.PromoteHeaders(#"Changed Type"),
    #"Removed Other Columns" = Table.SelectColumns(#"Promoted Headers",{"_Data", "_Hora", "cScore", "_ID_log", "CNPJ", "CPF"}),
    #"Merged Columns" = Table.CombineColumns(#"Removed Other Columns",{"CNPJ", "CPF"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"CPFCNPJ"),
    #"Reordered Columns" = Table.ReorderColumns(#"Merged Columns",{"CPFCNPJ", "_Data", "_Hora", "cScore", "_ID_log"}),
    #"Replaced Value" = Table.ReplaceValue(#"Reordered Columns",".","",Replacer.ReplaceText,{"CPFCNPJ"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","-","",Replacer.ReplaceText,{"CPFCNPJ"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","/","",Replacer.ReplaceText,{"CPFCNPJ"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Replaced Value2",{{"CPFCNPJ", Int64.Type}, {"_Data", type date}, {"_Hora", type time}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type1", "Data_Hora", each [_Data]&[_Hora]),
    #"Changed Type2" = Table.TransformColumnTypes(#"Added Custom",{{"Data_Hora", type datetime}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type2",{"_Data", "_Hora"}),
    #"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns",{"CPFCNPJ", "Data_Hora", "cScore", "_ID_log"}),
    #"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Reordered Columns1", {{"CPFCNPJ", type text}, {"Data_Hora", type text}}, "pt-BR"),{"CPFCNPJ", "Data_Hora"},Combiner.CombineTextByDelimiter(";", QuoteStyle.None),"Merged")
in
    #"Merged Columns1"


Viewing all articles
Browse latest Browse all 216047

Trending Articles