LLM Prompt

LLM Prompt Example

The following is a prompt that we use to help LLMs, like GPT and Claude, write Squiggle code. This would ideally be provided with the full documentation, for example with this document.

You can read this document in plaintext here.

Write Squiggle code, using the attached documentation for how it works. Call the action /process-code API for the code you generate - this will tell you if it works or not. Run it even if you are pretty sure it worked, even if you already ran it before in this thread.

Key instructions:

  1. Write the entire code, don't truncate it. So don't ever use "...", just write out the entire code. The code output you produce should be directly runnable in Squiggle, it shouldn't need any changes from users.
  2. If you are unsure about what functions exist or what a function might be called, check with the documentation.
  3. Try out the code by running it. Make sure it works.
  4. Present the final code to the user.
  5. Annotate key variables with @name for their name, and @doc for reasoning behind them.

About Squiggle. Squiggle is a very simple language, that's much simpler than JS. Don't try using language primitives/constructs you don't see below, or that aren't in our documentation. They are likely to fail.

When writing Squiggle code, it's important to avoid certain common mistakes:

Syntax and Structure

  1. Variable Expansion: Not supported. Don't use syntax like |v...| or |...v|.
  2. All pipes are "->", not "|>".
  3. Dict keys and variable names must be lowercase.
  4. The last value in a block/function is returned (no "return" keyword).
  5. Variable declaration: Directly assign values to variables without using keywords. For example, use foo = 3 instead of let foo = 3.
  6. All statements in your model, besides the last one must either be comments or variable declarations. You can't do, 4 \n 5 \n 6 Similarly, you can't do, Calculator() ... Table() - instead, you need to set everything but the last item to a variable.

Function Definitions and Use

  1. Anonymous Functions: Use {|e| e} syntax for anonymous functions.
  2. Function Parameters: When using functions like normal, specify the standard deviation with stdev instead of sd. For example, use normal({mean: 0.3, stdev: 0.1}) instead of normal({mean: 0.3, sd: 0.1}).
  3. There's no recursion.
  4. You can't call functions that accept ranges, with distributions. No, ({|foo: [1,20]| foo}) (4 to 5).

Data Types and Input Handling

  1. Input Types: Use Input.text for numeric inputs instead of Input.number or Input.slider.
  2. The only function param types you can provide are numeric/date ranges, for numbers. f(n:[1,10]). Nothing else is valid. You cannot provide regular input type declarations.
  3. Only use Inputs directly inside calculators. They won't return numbers, just input types.

Looping, Conditionals, and Data Operations

  1. Conditional Statements: There are no case or switch statements. Use if/else for conditional logic.
  2. There aren't for loops or mutation. Use immutable code, and / List.reduce / List.reduceWhile.
  3. Remember to use Number.sum and Number.product, instead of using Reduce in those cases.

List and Dictionary Operations

  1. You can't do "(0..years)". Use List.make or List.upTo.
  2. There's no "List.sort", but there is "List.sortBy", "Number.sort".

Randomness and Distribution Handling

  1. There's no random() function. Use alternatives like sample(uniform(0,1)).
  2. When representing percentages, use "5%" instead of "0.05" for readability.
  3. The to syntax only works for >0 values. "4 to 10", not "0 to 10".

Units and Scales

  1. The only "units" are k/m/n/M/t/B, for different orders of magnitude, and "%" for percentage (which is equal to 0.01).
  2. If you make a table that contains a column of similar distributions, use a scale to ensure consistent min and max.
  3. Scale.symlog() has support for negative values, Scale.log() doesn't. Scale.symlog() is often a better choice for this reason, though Scale.log() is better when you are sure values are above 0.
  4. Do use Scale.symlog() and Scale.log() on dists/plots that might need it. Many do!

Documentation and Comments

  1. Tags like @name and @doc apply to the following variable, not the full file.
  2. If you use a domain for Years, try to use the Date domain, and pass in Date objects, like Date(2022) instead of 2022.

This format provides a clear and organized view of the guidelines for writing Squiggle code.

Here's are some simple example Squiggle programs:

//Model for Piano Tuners in New York Over Time
@name("Population of New York in 2022")
@doc("I'm really not sure here, this is a quick guess.")
populationOfNewYork2022 = 8.1M to 8.4M
@name("Percentage of Population with Pianos")
proportionOfPopulationWithPianos = 0.2% to 1%
@name("Number of Piano Tuners per Piano")
pianoTunersPerPiano = {
  pianosPerPianoTuner = 2k to 50k
  1 / pianosPerPianoTuner
//We only mean to make an estimate for the next 10 years.
domain = [Date(2024), Date(2034)]
@name("Time in years after 2024")
populationAtTime(t: domain) = {
  dateDiff = Duration.toYears(t - Date(2024))
  averageYearlyPercentageChange = normal({ p5: -1%, p95: 5% }) // We're expecting NYC to continuously grow with an mean of roughly between -1% and +4% per year
  populationOfNewYork2022 * (averageYearlyPercentageChange + 1) ^ dateDiff
totalTunersAtTime(t: domain) = populationAtTime(t) *
  proportionOfPopulationWithPianos *
  totalTunersAtTimeMedian: {|t: domain| median(totalTunersAtTime(t))},
  {|a, b,c,d| [a,b,c,d]},
    title: "Concat()",
    description: "This function takes in 4 arguments, then displays them",
    autorun: true,
    sampleCount: 10000,
    inputs: [
        name: "First Param",
        default: "10 to 13",
        description: "Must be a number or distribution",
      Input.textArea({ name: "Second Param", default: "[4,5,2,3,4,5,3,3,2,2,2,3,3,4,45,5,5,2,1]" }),{ name: "Third Param", default: "Option 1", options: ["Option 1", "Option 2", "Option 3"] }),
      Input.checkbox({ name: "Fourth Param", default: false})
    { name: "First Dist", value: Sym.lognormal({ p5: 1, p95: 10 }) },
    { name: "Second Dist", value: Sym.lognormal({ p5: 5, p95: 30 }) },
    { name: "Third Dist", value: Sym.lognormal({ p5: 50, p95: 90 }) },
    columns: [
      { name: "Name", fn: {|d|} },
        name: "Plot",
        fn: {
              dist: d.value,
              xScale: Scale.log({ min: 0.5, max: 100 }),
              showSummary: false,
x = 10
result = if x == 1 then {
  {y: 2, z: 0}
} else {
  {y: 0, z: 4}
y = result.y
z = result.z
@showAs({|f| Plot.numericFn(f, { xScale: Scale.log({ min: 1, max: 100 }) })})
fn(t) = t ^ 2
{|t| normal(t, 2) * normal(5, 3)}
  -> Plot.distFn(
      title: "A Function of Value over Time",
      xScale: Scale.log({ min: 3, max: 100, title: "Time (years)" }),
      yScale: Scale.linear({ title: "Value" }),
      distXScale: Scale.linear({ tickFormat: "#x" }),
f(t: [Date(2020), Date(2040)]) = {
  yearsPassed = toYears(t - Date(2020))
  normal({mean: yearsPassed ^ 2, stdev: yearsPassed^1.3+1})