I want to build a path at run time and check whether that path and file exist in ADLS. As far as I understand, U-SQL cannot generate the path at run time, for example:

DECLARE @filePath string = @"/temppath";
FILE.EXISTS(@filePath + "/" + DateTime.UtcNow.AddDays(-numberofdays).Year + "/" + DateTime.UtcNow.AddDays(-numberofdays).Month + "/" + DateTime.UtcNow.AddDays(-numberofdays).Day + "/test.csv") AS [outputVal]

I wrote a UDF for this, but even there I have to pass in the initial ADLS path, i.e. @filePath. Please let me know how to do this.

Evaldas Buinauskas
Rahul Wagh

1 Answer

FILE.EXISTS() is a compile-time intrinsic and will not check at runtime. Also, even with a UDF, you cannot look at the file system, since user-defined code cannot reach outside of the container.

Also note that a script is declarative in nature and is not necessarily executed in the order you write the expressions.

What is it that you are trying to achieve?

Michael Rys
  • Thanks, Michael, for your quick reply. Based on columns from the input file I am building a path, and inside that path I want to check whether the file is present. The path and file are on the ADLS account that is linked to ADLA, so I don't want to explicitly pass the ADLS base path (like "adl://*****adls.azuredatalakestore.net/") to the UDF. If ADLS is linked to ADLA and I am running the U-SQL script on ADLA, why can't the UDF recognise that the base path belongs to the linked ADLS? – Rahul Wagh Apr 20 '18 at 20:19
  • Note that the store sits behind its own endpoints. For security reasons you do not want to give user code unprotected access to the store, so you would have to go through these endpoints, at which point you are outside the trust boundary. There are two options: one fast, one slow. – Michael Rys Apr 21 '18 at 00:44
  • The fast one: You write one script in U-SQL to generate the second script in U-SQL, thus making the paths in the second script static so the IF (FILE.EXISTS()) can be evaluated at compile time. This will reduce the processing to just the files you want to operate on. – Michael Rys Apr 21 '18 at 00:45
  • The slower one: Assuming that you have some understanding of the structure, use the EXTRACT over a file set and use the values in the file to filter on the virtual columns. I say slow here because the runtime join will not lead to a "partition elimination". So if you have way more files than what you actually want to work on matching the pattern, you will read a lot of data that you then will discard. – Michael Rys Apr 21 '18 at 00:48
  • Michael, can you please provide a code snippet for the fast option? Also, is there any documentation on best practices for U-SQL, UDFs and UDOs? I will be using U-SQL extensively in my work, and since U-SQL is new I am not finding most of the answers. – Rahul Wagh Apr 21 '18 at 06:43
  • There is an example at https://stackoverflow.com/questions/42636855/u-sql-output-in-azure-data-lake/42676271#42676271 that shows how to use U-SQL to generate another script (for a different question but you should be able to adapt it). You can start at http://usql.io for links to docs and samples. We also have videos on Channel 9 and my slides are on http://www.slideshare.net/MichaelRys – Michael Rys Apr 23 '18 at 17:24
  • I have written two U-SQL scripts. 1. The first builds the dynamic path and saves it to a CSV file; that is its output. 2. The second uses the first script's output as is, without any changes, and passes the path as a parameter to FILE.EXISTS. I still get the "expression cannot be evaluated at compile time" error message. A simple file check is not a simple task in U-SQL. – Rahul Wagh Apr 24 '18 at 15:26
  • The first script needs to generate the full second script so the path name is a static value in the second script. You cannot read a value from a file at compile time. – Michael Rys Apr 24 '18 at 17:28
  • It's confusing. Can you provide a code snippet for this use case? – Rahul Wagh Apr 24 '18 at 17:55
  • There are hundreds of records in the file; for each record I generate a dynamic path, and the next task is to check whether the file exists at the corresponding path. – Rahul Wagh Apr 25 '18 at 10:26
  • Can you please send me a sample input file to mrys at microsoft dot com? I can try to find some time to write a simple skeleton example based on that. – Michael Rys Apr 26 '18 at 18:40
  • Sent the details, please check. – Rahul Wagh Apr 27 '18 at 15:11
  • Michael, did you get a chance to look into the details I sent? – Rahul Wagh Apr 30 '18 at 17:28
  • Sorry, not yet. I hope later this week. Thanks for forwarding. – Michael Rys May 01 '18 at 19:57
  • Sorry, I was swamped with the Build conference and some other high-priority work. I will hopefully get to it next week (which looks much better!) – Michael Rys May 11 '18 at 21:01
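
No snippet for the "fast" option was posted in this thread. As a rough illustration of the idea only, here is a minimal sketch that writes the second script with every path as a string literal, so that each IF (FILE.EXISTS()) can be evaluated at compile time. Note the swap: the comments describe doing the generation in a first U-SQL script, whereas this sketch does it client-side in Python. The names `build_path` and `generate_script` are hypothetical, the base path and file name are taken from the question, and the body of each IF block is a placeholder to be filled with real U-SQL statements.

```python
from datetime import date, timedelta

BASE_PATH = "/temppath"  # base path from the question
FILE_NAME = "test.csv"   # file name from the question


def build_path(day: date) -> str:
    """Build /temppath/<year>/<month>/<day>/test.csv without zero padding,
    matching the Year/Month/Day concatenation in the question."""
    return f"{BASE_PATH}/{day.year}/{day.month}/{day.day}/{FILE_NAME}"


def generate_script(days) -> str:
    """Return U-SQL text with one IF block per day.

    Every path is emitted as a string literal, which is what lets the
    compiler evaluate FILE.EXISTS when the generated script is submitted.
    """
    blocks = []
    for day in days:
        path = build_path(day)
        blocks.append(
            f'IF (FILE.EXISTS("{path}")) THEN\n'
            f"    // hypothetical placeholder: statements that process {path}\n"
            f"END;"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Example: paths for a fixed date and the two days before it.
    start = date(2018, 4, 24)
    days = [start - timedelta(days=n) for n in range(3)]
    print(generate_script(days))
```

The generated text would then be saved and submitted as a separate U-SQL job. With hundreds of records, as in the question, the list of days would be replaced by whatever values are read from the input file.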