# AutoMan API Reference

## Supported Question Types

| Question Type | Purpose                                                                                                                  | Quality-Controlled | Number of Answers Returned |
| ------------- | ------------------------------------------------------------------------------------------------------------------------ | ------------------ | -------------------------- |
| `radio`       | The user is asked to choose one of n options.                                                                            | yes                | 1                          |
| `checkbox`    | The user is asked to choose m of n options, where m <= n.                                                                | yes                | 1                          |
| `freetext`    | The user is asked to enter a textual response, such that the response conforms to a simple pattern (a "picture clause"). | yes                | 1                          |
| `estimate`    | The user is asked to enter a numeric (real-valued) response.                                                             | yes                | 1                          |
| `radios`      | Same as `radio`, except that it returns the entire distribution.                                                         | no                 | sample size                |
| `checkboxes`  | Same as `checkbox`, except that it returns the entire distribution.                                                      | no                 | sample size                |
| `freetexts`   | Same as `freetext`, except that it returns the entire distribution.                                                      | no                 | sample size                |

{% hint style="info" %}
The difference between "quality controlled" and "non-quality controlled" questions comes down to whether you want a single, quality-controlled answer or all of the answers.  The former is useful in batch computation, where you rely on the "wisdom of the crowd" to choose the best answer.  The latter is used to obtain *i.i.d.* samples of the crowd.
{% endhint %}

### Calling a Question Type

We describe question type signatures below.  Note that calling a question type constructor *immediately launches a crowdsourcing task*.  This is usually not what you want, which is why [all of our sample applications](https://github.com/automan-lang/AutoMan/tree/master/apps) use the following pattern:

```scala
def my_function(<arg>, ...) = <AutoMan constructor>(<configuration>)
```

For example, here is a sample human function for calorie counting:

```scala
def howManyCals(imgUrl: String) = estimate (
    budget = 6.00,
    confidence_interval = SymmetricCI(50),
    text = "Estimate how many calories (kcal) are " +
           "present in the meal shown in the photo.",
    image_url = imgUrl,
    min_value = 0
)
```

Observe how we use a Scala user-defined function (`def`) to pass the `imgUrl` parameter through to the `estimate` constructor.  See our [sample apps](https://github.com/automan-lang/AutoMan/tree/master/apps) for additional examples.

### Question Return Types

All AutoMan question constructors return a result belonging to the supertype `Outcome`.  Although you can call `toString` on such a value to obtain a simple, printable string, you should usually pattern-match on the result of calling `answer` (or `answers`, depending on the question type) on the returned `Outcome` object.  Each question type has a different set of possible return values, which we describe in the next section.

You are encouraged to look at the [sample apps](https://github.com/automan-lang/AutoMan/tree/master/apps) for examples.

{% hint style="info" %}
AutoMan question function constructors return *immediately* and run *asynchronously* in a background thread.  This is an intentional design decision to allow you to start a crowdsourcing job and do other work while the task runs.  Calling `answer` (or `answers`, depending on the question type) will *block* execution until the task is done running, which may be a substantial amount of time.  Be sure that you want blocking behavior when you call `answer`.
{% endhint %}

{% hint style="warning" %}
The `toString` method for `Outcome` calls `answer` internally, which means that it blocks!
{% endhint %}
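
For concreteness, the pattern-matching idiom can be sketched as follows.  The case classes below are self-contained stand-ins that merely mirror (a subset of) the documented result fields so the example runs on its own; they are *not* AutoMan's actual definitions.

```scala
// Stand-in types that mirror a subset of the documented result fields.
// Illustration only: these are NOT AutoMan's actual class definitions.
sealed trait AbstractAnswer[T]
case class Answer[T](value: T, cost: BigDecimal, confidence: Double) extends AbstractAnswer[T]
case class LowConfidenceAnswer[T](value: T, cost: BigDecimal, confidence: Double) extends AbstractAnswer[T]
case class OverBudgetAnswer[T](need: BigDecimal, have: BigDecimal) extends AbstractAnswer[T]
case class NoAnswer[T]() extends AbstractAnswer[T]

// The same match shape you would write against the value returned by `answer`.
def describe(result: AbstractAnswer[Symbol]): String = result match {
  case Answer(v, cost, conf)           => s"answer: ${v.name}, cost: $cost, confidence: $conf"
  case LowConfidenceAnswer(v, _, conf) => s"low-confidence answer: ${v.name} ($conf)"
  case OverBudgetAnswer(need, have)    => s"over budget: need $need, have $have"
  case NoAnswer()                      => "no answer: unexpected runtime error"
}

println(describe(Answer(Symbol("kermit"), BigDecimal("2.50"), 0.95)))
```

In real code, the scrutinee would be `outcome.answer` rather than a hand-built value, and the match would block until the crowdsourcing task finishes, as noted above.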

### Question Type Constructor Signatures

We provide question constructors here.  All of them take a large number of parameters, but most parameters are shared between question types and nearly all have "sane defaults."  Defaults are managed by requiring the use of [named arguments](https://docs.scala-lang.org/tour/named-arguments.html).

Therefore, we provide two constructor signatures for each question type: 1) the simplified constructor showing only mandatory parameters, and 2) the full ScalaDoc-generated constructor with all parameters.  We also describe common parameters at the end.

We describe the variants used in the `mturk` DSL here.

#### Radio Button Questions

The following constructor parameters are mandatory:

```scala
def radio(
  options: List[MTQuestionOption],
  text: String)
  : ScalarOutcome[Symbol] 
```

* `options` are the selection options seen by the user, along with optional images.  Options can be created using one of the following `choice` constructors:

  * `choice(key: Symbol, text: String)` or&#x20;
  * `choice(key: Symbol, text: String, image_url: String)`

  where `key` denotes a stable identifier for a choice (e.g., `kermit`) that is not shown to the worker, `text` is the text label shown to the worker, and `image_url` is the URL of an image shown beside the text label.
* `text` is the text of the question shown in a `HIT` and, by default, also as the task title.  You can override the title by setting the `title` parameter.
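
Putting the pieces together, a complete radio question might look like the following sketch.  The option keys, labels, and image URLs here are purely illustrative:

```scala
// Illustrative sketch: a radio question with two image-backed options.
// The keys, labels, and URLs are made up for this example.
def whichOne() = radio (
  budget = 5.00,
  text = "Which one of these characters is Kermit the Frog?",
  options = List(
    choice(Symbol("kermit"), "Kermit", "https://example.com/kermit.png"),
    choice(Symbol("gonzo"),  "Gonzo",  "https://example.com/gonzo.png")
  )
)
```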

Radio button questions can return the following values:

* `Answer[Symbol]`: An object that represents a selected radio button, where each possible `Symbol` was defined with the `key` parameter of the `choice` constructors described above.  This object has the following fields:
  * `value`: the answer (`Symbol`).
  * `cost`: the total cost (`BigDecimal`)
  * `confidence`: the final confidence value (`Double`).
  * `distribution`: raw sample responses (`Array[Response[Symbol]]`)
* `LowConfidenceAnswer`, which has the same fields as `Answer` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
* `OverBudgetAnswer`, which indicates that a specified task cannot run at all due to insufficient funds.  This object has the following fields:
  * `need`: the funds needed to start a job (`BigDecimal`)
  * `have`: the funds at hand (`BigDecimal`)
* `NoAnswer`, which indicates that an unexpected runtime error occurred.

The following is a ScalaDoc-generated signature:

```scala
def radio[A <: AutomanAdapter, O](
  confidence: Double = MagicNumbers.DefaultConfidence,
  budget: BigDecimal = MagicNumbers.DefaultBudget,
  dont_reject: Boolean = true,
  dry_run: Boolean = false,
  image_alt_text: String = null,
  image_url: String = null,
  initial_worker_timeout_in_s: Int = ...,
  minimum_spawn_policy: MinimumSpawnPolicy = null,
  mock_answers: Iterable[MockAnswer[Symbol]] = null,
  options: List[AnyRef],
  pay_all_on_failure: Boolean = true,
  question_timeout_multiplier: Double = ...,
  text: String,
  title: String = null,
  wage: BigDecimal = MagicNumbers.USFederalMinimumWage)
  (implicit a: A)
  : ScalarOutcome[Symbol] 
```

#### Checkbox Questions

The following constructor parameters are mandatory:

```scala
def checkbox(
  options: List[MTQuestionOption],
  text: String)
  : ScalarOutcome[Set[Symbol]] 
```

* `options` are the selection options seen by the user, along with optional images.  Options can be created using one of the following `choice` constructors:

  * `choice(key: Symbol, text: String)` or&#x20;
  * `choice(key: Symbol, text: String, image_url: String)`

  where `key` denotes a stable identifier for a choice (e.g., `kermit`) that is not shown to the worker, `text` is the text label shown to the worker, and `image_url` is the URL of an image shown beside the text label.
* `text` is the text of the question shown in a `HIT` and, by default, also as the task title.  You can override the title by setting the `title` parameter.

Checkbox questions can return the following values:

* `Answer[Set[Symbol]]`: An object that represents a set of selected checkboxes, where each `Symbol` was defined with the `key` parameter of the `choice` constructors described above.  This object has the following fields:
  * `value`: the answer (`Set[Symbol]`).
  * `cost`: the total cost (`BigDecimal`)
  * `confidence`: the final confidence value (`Double`).
  * `distribution`: raw sample responses (`Array[Response[Set[Symbol]]]`)
* `LowConfidenceAnswer`, which has the same fields as `Answer` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
* `OverBudgetAnswer`, which indicates that a specified task cannot run at all due to insufficient funds.  This object has the following fields:
  * `need`: the funds needed to start a job (`BigDecimal`)
  * `have`: the funds at hand (`BigDecimal`)
* `NoAnswer`, which indicates that an unexpected runtime error occurred.

The following is a ScalaDoc-generated signature:

```scala
def checkbox[A <: AutomanAdapter, O](
  confidence: Double = MagicNumbers.DefaultConfidence,
  budget: BigDecimal = MagicNumbers.DefaultBudget,
  dont_reject: Boolean = true,
  dry_run: Boolean = false,
  image_alt_text: String = null,
  image_url: String = null,
  initial_worker_timeout_in_s: Int = ...,
  minimum_spawn_policy: MinimumSpawnPolicy = null,
  mock_answers: Iterable[MockAnswer[Set[Symbol]]] = null,
  options: List[AnyRef],
  pay_all_on_failure: Boolean = true,
  question_timeout_multiplier: Double = ...,
  text: String,
  title: String = null,
  wage: BigDecimal = MagicNumbers.USFederalMinimumWage)
  (implicit a: A)
  : ScalarOutcome[Set[Symbol]] 
```

#### Free-Text Questions

The following constructor parameters are mandatory:

```scala
def freetext(
  pattern: String,
  text: String)
  : ScalarOutcome[String] 
```

* `pattern` is a COBOL-style *picture clause* pattern that states which inputs are valid.  AutoMan uses this pattern to perform probability calculations.  `A` matches an alphabetic character, `B` matches an optional alphabetic character, `X` matches an alphanumeric character, `Y` matches an optional alphanumeric character, `9` matches a numeric character, and `0` matches an optional numeric character.  For example, a telephone number recognition application might use the pattern `09999999999`.
* `text` is the text of the question shown in a `HIT` and, by default, also as the task title.  You can override the title by setting the `title` parameter.
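
To make the picture-clause semantics concrete, here is a simplified stand-in matcher in plain Scala.  This is illustrative only and is not AutoMan's implementation:

```scala
// Simplified stand-in for picture-clause matching (NOT AutoMan's actual code).
// A = alphabetic, B = optional alphabetic, X = alphanumeric,
// Y = optional alphanumeric, 9 = numeric, 0 = optional numeric.
def pictureToRegex(pattern: String): String =
  pattern.map {
    case 'A' => "[A-Za-z]"
    case 'B' => "[A-Za-z]?"
    case 'X' => "[A-Za-z0-9]"
    case 'Y' => "[A-Za-z0-9]?"
    case '9' => "[0-9]"
    case '0' => "[0-9]?"
    case c   => java.util.regex.Pattern.quote(c.toString)  // literal character
  }.mkString

def matchesPicture(pattern: String, input: String): Boolean =
  input.matches(pictureToRegex(pattern))

// The telephone pattern accepts 10- or 11-digit numbers.
println(matchesPicture("09999999999", "14135452700"))  // 11 digits
println(matchesPicture("09999999999", "4135452700"))   // 10 digits
```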

The following parameters are `freetext`-specific:

* `allow_empty_pattern` means that the empty string is a valid worker response.\
  **default**: `false`
* `before_filter` is not currently used.
* `pattern_error_text` is a helpful message that is displayed to the user when their input does not match a pattern.  It is not mandatory but it is highly recommended that you use this setting.

{% hint style="info" %}
You should strongly consider using `pattern_error_text` for `freetext` questions as the default MTurk help message is not helpful.  This parameter gives you an opportunity to provide an error explanation in non-technical terms.
{% endhint %}

Free-text questions can return the following values:

* `Answer[String]`: An object that represents a response string.  This object has the following fields:
  * `value`: the answer (`String`).
  * `cost`: the total cost (`BigDecimal`)
  * `confidence`: the final confidence value (`Double`).
  * `distribution`: raw sample responses (`Array[Response[String]]`)
* `LowConfidenceAnswer`, which has the same fields as `Answer` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
* `OverBudgetAnswer`, which indicates that a specified task cannot run at all due to insufficient funds.  This object has the following fields:
  * `need`: the funds needed to start a job (`BigDecimal`)
  * `have`: the funds at hand (`BigDecimal`)
* `NoAnswer`, which indicates that an unexpected runtime error occurred.

The following is a ScalaDoc-generated signature:

```scala
def freetext[A <: AutomanAdapter](
  allow_empty_pattern: Boolean = false,
  confidence: Double = MagicNumbers.DefaultConfidence,
  before_filter: (String) ⇒ String = (a: String) => a,
  budget: BigDecimal = MagicNumbers.DefaultBudget,
  dont_reject: Boolean = true,
  dry_run: Boolean = false,
  image_alt_text: String = null,
  image_url: String = null,
  initial_worker_timeout_in_s: Int = ...,
  minimum_spawn_policy: MinimumSpawnPolicy = null,
  mock_answers: Iterable[MockAnswer[String]] = null,
  pay_all_on_failure: Boolean = true,
  pattern: String,
  pattern_error_text: String = null,
  question_timeout_multiplier: Double = ...,
  text: String,
  title: String = null,
  wage: BigDecimal = MagicNumbers.USFederalMinimumWage)
  (implicit a: A)
  : ScalarOutcome[String] 
```

#### Estimates

There is [an entire paper](https://docs.automanlang.org/technical-documentation/papers) (VoxPL) about this one question type.

The following constructor parameters are mandatory:

```scala
def estimate(
  confidence_interval: ConfidenceInterval,
  text: String)
  : EstimationOutcome 
```

* `confidence_interval` lets you denote the confidence interval of an estimate.  The options are:
  * `UnconstrainedCI()`, which performs only a single round of tasks using the default sample size, returning the $$L\_1$$ median.
  * `SymmetricCI(err: Double)`, which returns the $$L\_1$$ median $$\pm$$ `err` at confidence level `confidence`.
  * `AsymmetricCI(lerr: Double, herr: Double)`, which returns the $$L\_1$$ median of an estimate between `-lerr` and `+herr` at confidence level `confidence`.
* `text` is the text of the question shown in a `HIT` and, by default, also as the task title.  You can override the title by setting the `title` parameter.

Estimates can return the following values:

* `Estimate`: An object that represents a "best estimate".  This object has the following fields:
  * `value`: the estimate (`Double`).
  * `low`: the low bound of a confidence interval's estimate (`Double`).
  * `high`: the high bound of a confidence interval's estimate (`Double`).
  * `cost`: the total cost (`BigDecimal`)
  * `confidence`: the final confidence value (`Double`).
  * `distribution`: raw sample responses (`Array[Response[Double]]`)
* `LowConfidenceEstimate`, which has the same fields as `Estimate` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
* `OverBudgetEstimate`, which indicates that a specified task cannot run at all due to insufficient funds.  This object has the following fields:
  * `need`: the funds needed to start a job (`BigDecimal`)
  * `have`: the funds at hand (`BigDecimal`)
* `NoEstimate`, which indicates that an unexpected runtime error occurred.

The following is a ScalaDoc-generated signature:

```scala
def estimate[A <: AutomanAdapter](
  confidence_interval: ConfidenceInterval = UnconstrainedCI(),
  confidence: Double = MagicNumbers.DefaultConfidence,
  budget: BigDecimal = MagicNumbers.DefaultBudget,
  default_sample_size: Int = -1,
  dont_reject: Boolean = true,
  dry_run: Boolean = false,
  estimator: (Seq[Double]) ⇒ Double = null,
  image_alt_text: String = null,
  image_url: String = null,
  initial_worker_timeout_in_s: Int = ...,
  max_value: Double = Double.MaxValue,
  minimum_spawn_policy: MinimumSpawnPolicy = null,
  min_value: Double = Double.MinValue,
  mock_answers: Iterable[MockAnswer[Double]] = null,
  pay_all_on_failure: Boolean = true,
  question_timeout_multiplier: Double = ...,
  text: String,
  title: String = null,
  wage: BigDecimal = MagicNumbers.USFederalMinimumWage)
  (implicit a: A)
  : EstimationOutcome 
```

#### Sampling Questions

We describe the `checkboxes` constructor here, but `freetexts` and `radios` are similar.  There is also a buggy `multiestimate` constructor that should probably not be used at the moment.

The following constructor parameters are mandatory:

```scala
def checkboxes(
  sample_size: Int = ...,
  options: List[MTQuestionOption],
  text: String)
  : VectorOutcome[Set[Symbol]] 
```

* `sample_size` is the number of responses to sample from the crowd.
* `options` are the selection options seen by the user, along with optional images.  Options can be created using one of the following `choice` constructors:

  * `choice(key: Symbol, text: String)` or&#x20;
  * `choice(key: Symbol, text: String, image_url: String)`

  where `key` denotes a stable identifier for a choice (e.g., `kermit`) that is not shown to the worker, `text` is the text label shown to the worker, and `image_url` is the URL of an image shown beside the text label.
* `text` is the text of the question shown in a `HIT` and, by default, also as the task title.  You can override the title by setting the `title` parameter.

The following is a ScalaDoc-generated signature:

```scala
def checkboxes[A <: AutomanAdapter, O](
  sample_size: Int = ...,
  budget: BigDecimal = MagicNumbers.DefaultBudget,
  dont_reject: Boolean = true,
  dry_run: Boolean = false,
  image_alt_text: String = null,
  image_url: String = null,
  initial_worker_timeout_in_s: Int = ...,
  minimum_spawn_policy: MinimumSpawnPolicy = null,
  mock_answers: Iterable[MockAnswer[Set[Symbol]]] = null,
  options: List[AnyRef],
  pay_all_on_failure: Boolean = true,
  question_timeout_multiplier: Double = ...,
  text: String,
  title: String = null,
  wage: BigDecimal = MagicNumbers.USFederalMinimumWage)
  (implicit a: A)
  : VectorOutcome[Set[Symbol]] 
```

#### Common Default Parameters

The following are parameters common to all calls:

* `budget` is the total amount of money to be spent by a *given* human question function call.  Note that this means that *each function call* has its own budget.\
  **default**: $5.00
* `dont_reject`, when set to `true`, always accepts completed assignments and pays workers for their work.  This is useful when work is difficult and errors are likely, or when you just don't want to deal with the hassle of reputation management. \
  **default**: `true`
* `dry_run`, when set to `true`, will not actually post jobs on MTurk.\
  **default**: `false`
* `image_alt_text` adds an HTML `ALT` annotation to the `IMG` tag created by the `image_url` parameter.\
  **default**: none (`null`)
* `image_url` adds an image to a question.  Such images should be hosted someplace publicly accessible, such as Amazon S3 or a personal website.\
  **default**: none (`null`)
* `initial_worker_timeout_in_s` is the amount of time permitted to a worker in the initial round of tasks.  Note that the actual time permitted depends on the number of rounds and is determined by the quality control policy.  The default policy uses the formula $$w m^r$$ , where $$w$$ is the `initial_worker_timeout_in_s`, $$m=2$$, and $$r$$ is the round.  In other words, task timeouts double each round.\
  **default**: 30 seconds
* `minimum_spawn_policy` states the smallest number of assignments AutoMan will post for a given `HIT` on MTurk.  This is necessary because MTurk has two totally boneheaded policies:

  * HITs posted with fewer than 10 assignments are charged a 20% fee, while HITs with 10 or more assignments are charged a 40% fee.
  * HITs posted with fewer than 10 assignments cannot be "extended" to have 10 or more assignments.

  For now, this means that, unless you change the default, AutoMan will post tasks with at least 10 assignments.  If you anticipate that your tasks will need fewer than 10 assignments, set this parameter to `UserDefinableSpawnPolicy(n)`, where `n` is the number you want.\
  **default**: 10\
  **note**: I am actively unhappy about this and am thinking of ways to simplify it.  [Suggestions welcome](https://github.com/automan-lang/AutoMan/issues).
* `mock_answers` sets AutoMan to run in *mock* mode for testing purposes.  This is used internally by AutoMan for testing.  You should not change this.\
  **default**: `null`
* `pay_all_on_failure` controls whether workers are paid when a task runs out of money.  Setting this to `false` means that workers will not be paid when an `OverBudget` result is returned, which generally makes workers unhappy.\
  **default**: `true`
* `question_timeout_multiplier` controls how long a HIT remains on MTurk before it times out.  Note that this is a distinct timeout from the amount of time a worker is given to complete a task, which is controlled by the `initial_worker_timeout_in_s` parameter.  A HIT's total lifetime is determined by the formula $$w m^r t$$ , where $$w$$ is the `initial_worker_timeout_in_s`, $$m=2$$, $$r$$ is the round, and $$t$$ is the `question_timeout_multiplier`.\
  **default**: 500
* `wage` controls the base wage for a worker.  The actual reward paid depends on how much time a worker is given to do a task.  The default policy uses a maximum likelihood estimate of the probability that a task is accepted in order to compute a wage that disincentivizes wage-gaming behavior.  It is complicated enough that, if you want to know its inner workings, you should [read our 2016 CACM article](https://docs.automanlang.org/technical-documentation/papers).  Generally, you should think of the reward as "probably doubling."\
  **default**: the U.S. federal minimum wage, $7.25/hour
* `a` is an initialized AutoMan platform adapter.  Typically this will be an `implicit` variable that you return from a platform initializer expression like `mturk`.  When marked `implicit`, you do not need to pass the parameter yourself; Scala will find it in the environment and pass it, simplifying human function calls.  AutoMan needs this information in order to bind a human function call to a given crowdsourcing platform.
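
The two timing formulas above ($$w m^r$$ and $$w m^r t$$) can be illustrated with a small computation.  This assumes the documented defaults and rounds numbered from zero (an assumption made for illustration):

```scala
// Illustrates the documented timing formulas with default parameter values.
val w = 30    // initial_worker_timeout_in_s (default: 30 seconds)
val m = 2     // timeout-doubling multiplier
val t = 500   // question_timeout_multiplier (default: 500)

// Worker timeout in round r: w * m^r
def workerTimeout(r: Int): Int = w * math.pow(m, r).toInt

// Total HIT lifetime in round r: w * m^r * t
def hitLifetime(r: Int): Int = workerTimeout(r) * t

println(workerTimeout(0))  // 30 seconds in the initial round
println(workerTimeout(2))  // 120 seconds by round 2
println(hitLifetime(0))    // 15000 seconds of HIT lifetime in the initial round
```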

## Using AutoMan with a Different Crowdsourcing Backend

We currently only support Amazon's Mechanical Turk. However, AutoMan was designed to accommodate arbitrary backends. If you are interested in seeing your crowdsourcing platform supported, please contact us.

## Memoization

AutoMan can be configured to save all intermediate human-computed results.  Set the location of the database with `database_path = "/path/to/your/database"`. The format of the database is H2.
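
For example, here is a configuration sketch.  The credential strings are placeholders, and we assume the `mturk` initializer parameters (`access_key_id`, `secret_access_key`, `sandbox_mode`) shown in the AutoMan README:

```scala
// Configuration sketch: enabling memoization via database_path.
// Credential values are placeholders; do not commit real keys to source control.
implicit val a = mturk (
  access_key_id = "<your AWS access key>",
  secret_access_key = "<your AWS secret key>",
  sandbox_mode = true,
  database_path = "/path/to/your/database"
)
```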

## Sample Applications <a href="#sample_apps" id="sample_apps"></a>

Sample applications can be found in the `apps` directory. Apps can also be built using `pack`. E.g.,

```bash
$ cd apps/simple_program
$ sbt pack
```

Unix/DOS shell scripts for running the programs can then be found in `apps/[the app]/target/pack/bin/`.
