Questions tagged [palantir-foundry]

Palantir Foundry is a web-based data analytics and decision modeling SaaS platform. Use this tag for questions about building your own models in Foundry using Python, R, or SQL or working with the Foundry API.

731 questions
9
votes
2 answers

What is the difference between transform & transform_df in Palantir Foundry?

Can someone explain why we need transform & transform_df methods separately?
sumitkanoje
  • 1,217
  • 14
  • 22
7
votes
2 answers

Why is my build hanging / taking a long time to generate my query plan with many unions?

I notice when I run the same code as my example over here but with a union or unionByName or unionAll instead of the join, my query planning takes significantly longer and can result in a driver OOM. Code included here for reference, with a slight…
7
votes
1 answer

How is Workshop different from Slate in Foundry?

I see that the Foundry platform offers both Slate and Workshop. What are some real business cases for each? How do they differ, and where does each fit best? Can anyone shed some light on this?
Ram D
  • 167
  • 1
  • 8
6
votes
1 answer

Code Repository - What exactly is CTX in pyspark for a code repo?

I have seen ctx used in a code repo. What exactly is it? Is it a built-in library? When would I use it? I've seen it in examples such as the following: df = ctx.spark.createdataframe(...
Robert F
  • 187
  • 5
5
votes
0 answers

How to prevent a sort on a groupby.applyInPandas using hash partitioning on the upstream dataset?

In my main transform, I'm running an algorithm by doing a groupby and then applyInPandas in Foundry. The build takes very long, and one idea is to organize the files to prevent shuffle reads and sorting, using Hash partitioning/bucketing. For a…
5
votes
1 answer

Best way to modify downstream references to a code workbook dataset to point to the new code repository dataset created using helper?

When using the "Export to Code Repository Helper" tool in an existing code workbook, what is the most efficient way to modify downstream dependencies to point to the newly created Code Repository dataset? We want to modify all downstream…
5
votes
2 answers

Displaying a PDF file stored in a Dataset on Palantir Foundry in Slate Application

I am trying to display PDF files in Slate on Palantir Foundry. I managed to display PDF files that are stored in a folder on Foundry without a schema, but not PDFs that are in a Dataset. Is there a way to display PDF files that are stored in a…
FloHab
  • 93
  • 6
5
votes
1 answer

How do I parse xml documents in Palantir Foundry?

I have a set of .xml documents that I want to parse. I have previously tried to parse them using methods that take the file contents and dump them into a single cell; however, I've noticed this doesn't work in practice since I'm seeing slower and…
5
votes
3 answers

How to union multiple dynamic inputs in Palantir Foundry?

I want to union multiple datasets in Palantir Foundry. The dataset names are dynamic, so I cannot list them statically in transform_df(). Is there a way I can dynamically take multiple inputs into transform_df and…
5
votes
1 answer

How can I iterate over JSON files in code repositories and incrementally append to a dataset?

I have imported a dataset of 100,000 raw JSON files (about 100 GB) into Foundry through Data Connection. I want to use the Python Transforms raw file access transformation to read the files, flatten arrays of structs and structs into a dataframe as…
5
votes
2 answers

How does Foundry Magritte append ingestion handle deleted rows in the data source?

If I have a Magritte ingestion that is set to append, will it detect if rows are deleted in the source data? Will it also delete the rows in the ingested dataset?
datawizard
  • 73
  • 2
5
votes
2 answers

How to create Python libraries and import them in Palantir Foundry

To generalize my Python functions, I want to put them in Python libraries so that I can use them across multiple repositories. Could anyone please answer the questions below? 1) How do we create our own Python libraries? 2) how…
4
votes
1 answer

Palantir Foundry forbids connection to external web service

I want to query weather information using the meteostat library. When doing so, however, Palantir Foundry forbids requests to the external web service. The code example below fails with the error message urllib.error.URLError:
stn53
  • 97
  • 6
4
votes
2 answers

How to do a recursive self-join in Foundry Contour?

I have a dataset which represents objects in a hierarchy (there are no cycles). I want to analyse it in Contour and figure out for each object the list of top-level related objects. Say, my object A depends on objects B and C. Object C in turn…
4
votes
1 answer

How can I get my dataset's name in a code repository?

When combining multiple datasets in Python in a code repository, I want to put the dataset name in the first column, but I couldn't figure out how to get it by accessing its path: @transform_df( Output("/folder/folder1/datasets/mydatset"), df1 =…