AutoMan API Reference
| Question Type | Purpose | Quality-Controlled | Number of Answers Returned |
| --- | --- | --- | --- |
| `radio` | The user is asked to choose one of n options. | yes | 1 |
| `checkbox` | The user is asked to choose m of n options, where m <= n. | yes | 1 |
| `freetext` | The user is asked to enter a textual response, such that the response conforms to a simple pattern (a "picture clause"). | yes | 1 |
| `estimate` | The user is asked to enter a numeric (real-valued) response. | yes | 1 |
| `radios` | Same as `radio`, except that it returns the entire distribution. | no | sample size |
| `checkboxes` | Same as `checkbox`, except that it returns the entire distribution. | no | sample size |
| `freetexts` | Same as `freetext`, except that it returns the entire distribution. | no | sample size |
We describe question type signatures below. It is important to note that calling a question type constructor immediately launches a crowdsourcing task. This is not usually what you want, which is why we wrap the constructor call in a function definition (`def`), using the following pattern:
For example, here is a sample human function for calorie counting:
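A minimal, self-contained sketch of such a human function follows. The `estimate` here is a local stand-in that only records each launch; it is not AutoMan's constructor, and the object name, the counter, and the question text are all illustrative assumptions.

```scala
object DeferredLaunch {
  // Counts "posted" tasks; stands in for the side effect of a real
  // question constructor launching a crowdsourcing task.
  var launched = 0

  // Hypothetical stand-in for a question constructor: calling it launches.
  def estimate(text: String, image_url: String): String = {
    launched += 1
    s"task #${launched}: ${text} [${image_url}]"
  }

  // Wrapping the constructor call in a `def` defers the launch until the
  // function is called, and lets the caller pass imgUrl through.
  def howManyCalories(imgUrl: String): String =
    estimate(
      text = "How many calories are in the meal pictured in this image?",
      image_url = imgUrl
    )
}
```

Defining `howManyCalories` launches nothing; a task is only posted when the function is called, for example `DeferredLaunch.howManyCalories("https://example.com/meal.jpg")`.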
Another thing to note is that all AutoMan question constructors return a result belonging to the supertype `Outcome`. Although you can call `toString` on such return values to obtain a simple, printable string, you should probably pattern-match on the result value returned by calling `answer` (or `answers`, depending on the question) on the returned `Outcome` object. Each question type has a different set of possible return values. We describe them in the next section.
The `toString` method for `Outcome` calls `answer` internally, which means that it blocks!
We provide two constructor signatures for each question type: 1) a simplified constructor showing only mandatory parameters, and 2) the full ScalaDoc-generated constructor with all parameters. We also describe common parameters at the end.
We describe the variants used in the `mturk` DSL here.
The following constructor parameters are mandatory:
- `options` are the selection options seen by the user, along with optional images. Options can be created using one of the following `choice` constructors: `choice(key: Symbol, text: String)` or `choice(key: Symbol, text: String, image_url: String)`, where `key` denotes a stable identifier for a choice (e.g., `kermit`) not shown to the worker, `text` is the text label shown to the worker, and `image_url` is a URL of an image shown beside the text label.
- `text` is the text of the question shown in a HIT and, by default, also as the task title. You can override the title by setting the `title` parameter.
Radio button questions can return the following values:
- `Answer[Symbol]`: An object that represents a selected radio button, where each possible `Symbol` was defined with the `key` parameter of the `choice` constructors described above. This object has the following fields:
  - `value`: the answer (`Symbol`).
  - `cost`: the total cost (`BigDecimal`).
  - `confidence`: the final confidence value (`Double`).
  - `distribution`: raw sample responses (`Array[Response[Symbol]]`).
- `LowConfidenceAnswer`, which has the same fields as `Answer` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
- `OverBudgetAnswer`, which indicates that a specified task cannot run at all due to insufficient funds. This object has the following fields:
  - `need`: the funds needed to start a job (`BigDecimal`).
  - `have`: the funds at hand (`BigDecimal`).
- `NoAnswer`, which indicates that an unexpected runtime error occurred.
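Pattern-matching on these return values might look like the sketch below. The case classes are simplified stand-ins modeled on the fields listed above (the `distribution` field is omitted); they are assumptions made so that the example compiles on its own, not AutoMan's actual class definitions.

```scala
object OutcomeMatching {
  // Stand-in result types; NOT AutoMan's own definitions.
  sealed trait Result
  final case class Answer(value: Symbol, cost: BigDecimal, confidence: Double) extends Result
  final case class LowConfidenceAnswer(value: Symbol, cost: BigDecimal, confidence: Double) extends Result
  final case class OverBudgetAnswer(need: BigDecimal, have: BigDecimal) extends Result
  case object NoAnswer extends Result

  // Handle every possible result explicitly; the sealed trait lets the
  // compiler warn when a case is forgotten.
  def describe(result: Result): String = result match {
    case Answer(value, cost, confidence) =>
      s"answer: ${value.name} (cost ${cost}, confidence ${confidence})"
    case LowConfidenceAnswer(value, _, confidence) =>
      s"low-confidence answer: ${value.name} (confidence ${confidence})"
    case OverBudgetAnswer(need, have) =>
      s"over budget: need ${need}, have ${have}"
    case NoAnswer =>
      "no answer: unexpected runtime error"
  }
}
```

Handling the over-budget and no-answer cases separately keeps a funding problem from being mistaken for a worker-quality problem.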
The following is a ScalaDoc-generated signature:
The following constructor parameters are mandatory:
- `options` are the selection options seen by the user, along with optional images. Options can be created using one of the following `choice` constructors: `choice(key: Symbol, text: String)` or `choice(key: Symbol, text: String, image_url: String)`, where `key` denotes a stable identifier for a choice (e.g., `kermit`) not shown to the worker, `text` is the text label shown to the worker, and `image_url` is a URL of an image shown beside the text label.
- `text` is the text of the question shown in a HIT and, by default, also as the task title. You can override the title by setting the `title` parameter.
Checkbox questions can return the following values:
- `Answer[Set[Symbol]]`: An object that represents a set of selected checkboxes, where each `Symbol` was defined with the `key` parameter of the `choice` constructors described above. This object has the following fields:
  - `value`: the answer (`Set[Symbol]`).
  - `cost`: the total cost (`BigDecimal`).
  - `confidence`: the final confidence value (`Double`).
  - `distribution`: raw sample responses (`Array[Response[Set[Symbol]]]`).
- `LowConfidenceAnswer`, which has the same fields as `Answer` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
- `OverBudgetAnswer`, which indicates that a specified task cannot run at all due to insufficient funds. This object has the following fields:
  - `need`: the funds needed to start a job (`BigDecimal`).
  - `have`: the funds at hand (`BigDecimal`).
- `NoAnswer`, which indicates that an unexpected runtime error occurred.
The following is a ScalaDoc-generated signature:
The following constructor parameters are mandatory:
- `pattern` is a COBOL-style picture clause pattern that states what inputs are valid. AutoMan uses this pattern to perform probability calculations. `A` matches an alphabetic character, `B` matches an optional alphabetic character, `X` matches an alphanumeric character, `Y` matches an optional alphanumeric character, `9` matches a numeric character, and `0` matches an optional numeric character. For example, a telephone number recognition application might use the pattern `09999999999`.
- `text` is the text of the question shown in a HIT and, by default, also as the task title. You can override the title by setting the `title` parameter.
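To make the picture clause semantics concrete, the sketch below translates a pattern into a regular expression and checks a response against it. This illustrates the matching rules described above and is not AutoMan's own matcher; treating "alphabetic" as ASCII letters and matching any other pattern character literally are assumptions of this sketch, and `PictureClause` is a hypothetical name.

```scala
import java.util.regex.Pattern

object PictureClause {
  // Translate a picture clause into a regular expression, one symbol at a time.
  def toRegex(picture: String): String =
    picture.toList.map {
      case 'A' => "[A-Za-z]"      // alphabetic
      case 'B' => "[A-Za-z]?"     // optional alphabetic
      case 'X' => "[A-Za-z0-9]"   // alphanumeric
      case 'Y' => "[A-Za-z0-9]?"  // optional alphanumeric
      case '9' => "[0-9]"         // numeric
      case '0' => "[0-9]?"        // optional numeric
      // Assumption: any other character must appear literally.
      case c   => Pattern.quote(c.toString)
    }.mkString

  // True when the whole input conforms to the picture clause.
  def matches(picture: String, input: String): Boolean =
    input.matches(toRegex(picture))
}
```

Under these rules the telephone pattern `09999999999` accepts both ten-digit and eleven-digit numbers, since the leading `0` marks an optional digit.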
The following parameters are `freetext`-specific:
- `allow_empty_pattern` means that the empty string is a valid worker response.
  - default: `false`
- `before_filter` is not currently used.
- `pattern_error_text` is a helpful message that is displayed to the user when their input does not match a pattern. It is not mandatory, but it is highly recommended that you use this setting.
Free-text questions can return the following values:
- `Answer[String]`: An object that represents a response string. This object has the following fields:
  - `value`: the answer (`String`).
  - `cost`: the total cost (`BigDecimal`).
  - `confidence`: the final confidence value (`Double`).
  - `distribution`: raw sample responses (`Array[Response[String]]`).
- `LowConfidenceAnswer`, which has the same fields as `Answer` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
- `OverBudgetAnswer`, which indicates that a specified task cannot run at all due to insufficient funds. This object has the following fields:
  - `need`: the funds needed to start a job (`BigDecimal`).
  - `have`: the funds at hand (`BigDecimal`).
- `NoAnswer`, which indicates that an unexpected runtime error occurred.
The following is a ScalaDoc-generated signature:
The following constructor parameters are mandatory:
- `confidence_interval` lets you denote the confidence interval of an estimate. The options are `UnconstrainedCI()`, `SymmetricCI(err: Double)`, and `AsymmetricCI(lerr: Double, herr: Double)`.
- `text` is the text of the question shown in a HIT and, by default, also as the task title. You can override the title by setting the `title` parameter.
Estimates can return the following values:
- `Estimate`: An object that represents a "best estimate". This object has the following fields:
  - `value`: the estimate (`Double`).
  - `low`: the low bound of a confidence interval's estimate (`Double`).
  - `high`: the high bound of a confidence interval's estimate (`Double`).
  - `cost`: the total cost (`BigDecimal`).
  - `confidence`: the final confidence value (`Double`).
  - `distribution`: raw sample responses (`Array[Response[Double]]`).
- `LowConfidenceEstimate`, which has the same fields as `Estimate` but which indicates that a quality-controlled response has a `confidence` lower than the desired threshold.
- `OverBudgetEstimate`, which indicates that a specified task cannot run at all due to insufficient funds. This object has the following fields:
  - `need`: the funds needed to start a job (`BigDecimal`).
  - `have`: the funds at hand (`BigDecimal`).
- `NoEstimate`, which indicates that an unexpected runtime error occurred.
The following is a ScalaDoc-generated signature:
We describe the `checkboxes` constructor here, but `freetexts` and `radios` are similar. There is also a buggy `multiestimate` constructor that should probably not be used at the moment.
The following constructor parameters are mandatory:
- `sample_size` is the size of the sample.
- `options` are the selection options seen by the user, along with optional images. Options can be created using one of the following `choice` constructors: `choice(key: Symbol, text: String)` or `choice(key: Symbol, text: String, image_url: String)`, where `key` denotes a stable identifier for a choice (e.g., `kermit`) not shown to the worker, `text` is the text label shown to the worker, and `image_url` is a URL of an image shown beside the text label.
- `text` is the text of the question shown in a HIT and, by default, also as the task title. You can override the title by setting the `title` parameter.
The following is a ScalaDoc-generated signature:
The following are parameters common to all calls:
- `budget` is the total amount of money to be spent by a given human question function call. Note that this means that each function call has its own budget.
  - default: $5.00
- `dont_reject`, when set to `true`, will always accept completed assignments and pay workers for their work. This is useful when work is difficult and errors are likely, or when you just don't want to deal with the hassle of reputation management.
  - default: `false`
- `dry_run`, when set to `true`, will not actually post jobs on MTurk.
  - default: `false`
- `image_alt_text` adds an HTML `ALT` annotation to the `IMG` tag created by the `image_url` parameter.
  - default: none (`null`)
- `image_url` adds an image to a question. Such images should be hosted someplace publicly accessible, such as Amazon S3 or a personal website.
  - default: none (`null`)
- `minimum_spawn_policy` states the smallest number of assignments for a given HIT on MTurk. This is necessary because MTurk has two totally boneheaded policies:
  1. HITs posted with fewer than 10 assignments are charged a 20% fee, while HITs with 10 or more assignments are charged a 40% fee.
  2. HITs posted with fewer than 10 assignments cannot be "extended" to have 10 or more assignments.
- `mock_answers` sets AutoMan to be used in mock mode for testing purposes. This is used internally by AutoMan for testing. You should not change this.
  - default: `null`
- `pay_all_on_failure` controls whether workers are paid when a task runs out of money. Setting this to `false` means that workers will not be paid when an `OverBudget` result is returned, which generally makes workers unhappy.
  - default: `true`
- default for `question_timeout_multiplier`: 500
- `a` is an initialized AutoMan platform adapter. Typically this will be an `implicit` variable that you return from a platform initializer expression like `mturk`. When marked `implicit`, you do not need to pass the parameter yourself; Scala will find it in the environment and pass it, simplifying human function calls. AutoMan needs this information in order to bind a human function call to a given crowdsourcing platform.
We currently only support Amazon's Mechanical Turk. However, AutoMan was designed to accommodate arbitrary backends. If you are interested in seeing your crowdsourcing platform supported, please contact us.
AutoMan can be configured to save all intermediate human-computed results. Set the location of the database with `database_path = "/path/to/your/database"`. The format of the database is H2.
Sample applications can be found in the `apps` directory. Apps can also be built using `pack`. Unix/DOS shell scripts for running the programs can then be found in `apps/[the app]/target/pack/bin/`.
Observe how we use a Scala user-defined function (`def`) to pass the `imgUrl` parameter through to the `estimate` constructor. See our sample applications for additional examples.
You are encouraged to look at the sample applications in the `apps` directory for examples.
We provide question constructors here. Note that all of them take a very large number of parameters, but most of those parameters are the same between question types and nearly all of them have "sane defaults." Defaults are managed by requiring the use of named parameters.
There is an entire paper (VoxPL) about the `estimate` question type.
The `confidence_interval` options are:
- `UnconstrainedCI()`, which will only ever perform one round of tasks using the default sample size, returning the median.
- `SymmetricCI(err: Double)`, which returns the median ± `err` with confidence level `confidence`.
- `AsymmetricCI(lerr: Double, herr: Double)`, which returns the median of an estimate between -`lerr` and +`herr` with confidence level `confidence`.
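One way to read these options is as a tolerance around the median. The sketch below computes the interval each option would tolerate for a given median. The types are local stand-ins named after the constructors above, and the interpretation of `err`, `lerr`, and `herr` as absolute offsets from the median is an assumption of this sketch rather than a statement of AutoMan's implementation.

```scala
object ConfidenceIntervals {
  // Stand-in types; NOT AutoMan's own definitions.
  sealed trait CI
  case object Unconstrained extends CI
  final case class Symmetric(err: Double) extends CI
  final case class Asymmetric(lerr: Double, herr: Double) extends CI

  // The (low, high) interval tolerated around `median`, or None when
  // the interval is unconstrained.
  def targetBounds(median: Double, ci: CI): Option[(Double, Double)] = ci match {
    case Unconstrained          => None
    case Symmetric(err)         => Some((median - err, median + err))
    case Asymmetric(lerr, herr) => Some((median - lerr, median + herr))
  }
}
```

For a calorie estimate with a median of 500, `Symmetric(50.0)` tolerates the interval from 450 to 550, while `Asymmetric(20.0, 80.0)` tolerates 480 to 580.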
- `initial_worker_timeout_in_s` is the amount of time permitted to a worker in the initial round of tasks. Note that the actual time permitted depends on the number of rounds and is determined by the quality control policy. The default policy uses the formula $t \cdot m^r$, where $t$ is the `initial_worker_timeout_in_s`, $m = 2$, and $r$ is the round. In other words, task timeouts are doubled each round.
  - default: 30 seconds
For `minimum_spawn_policy`, what this means for now is that, if you do not change the default, AutoMan will post tasks with at least 10 assignments. If you anticipate that your tasks will likely need fewer than 10 assignments, you can set the anticipated amount by setting this parameter to `UserDefinableSpawnPolicy(n)`, where `n` is the number you want.
  - default: 10
  - note: I am actively unhappy about this and am thinking of ways to simplify it.
- `question_timeout_multiplier` controls how much time a HIT exists on MTurk before it is timed out. Note that this is a distinct timeout from the amount of time a worker is given to complete a task, which is controlled by the `initial_worker_timeout_in_s` parameter. A HIT's total time is determined by the formula $q \cdot t \cdot m^r$, where $t$ is the `initial_worker_timeout_in_s`, $m = 2$, $r$ is the round, and $q$ is the `question_timeout_multiplier`.
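The two timeouts can be sketched as plain arithmetic. The functions below assume that "timeouts are doubled" means the worker timeout in round r is the initial timeout times 2 to the power r, that rounds are counted from 0, and that a HIT's lifetime is the worker timeout scaled by `question_timeout_multiplier`. The object and function names are hypothetical.

```scala
object Timeouts {
  // Worker timeout in seconds for a given round: t * 2^r.
  // Assumption: the initial round is round 0.
  def workerTimeoutS(initialTimeoutS: Int, round: Int): Long =
    initialTimeoutS.toLong * (1L << round)

  // HIT lifetime in seconds: q * t * 2^r, where q is the
  // question_timeout_multiplier.
  def hitLifetimeS(initialTimeoutS: Int, round: Int, multiplier: Double): Double =
    multiplier * workerTimeoutS(initialTimeoutS, round)
}
```

With a 30-second initial timeout, a worker in round 2 would get 120 seconds under these assumptions, and a multiplier of 500 would keep a round-0 HIT alive for 15,000 seconds.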
- `wage` controls the base wage for a worker. The actual reward paid depends on how much time a worker is given to do a task. The default policy uses a maximum likelihood estimate of the probability that a task is accepted in order to compute a wage that disincentivizes wage-gaming behavior. It is complicated enough that its inner workings are not covered here. Generally you should think of the reward as "probably doubling."
  - default: the U.S. Federal Minimum wage, or $7.25/hour
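To get a feel for the magnitudes involved, the arithmetic below prorates an hourly wage over a task's time limit. This is only the straight-line proration implied by an hourly rate; it is an assumption of this sketch, not AutoMan's pricing policy, and `proratedReward` is a hypothetical helper.

```scala
import scala.math.BigDecimal.RoundingMode

object WageSketch {
  // Straight-line proration of an hourly wage over a task's time limit,
  // rounded to whole cents. Hypothetical helper, not part of AutoMan.
  def proratedReward(hourlyWage: BigDecimal, timeoutS: Int): BigDecimal =
    (hourlyWage * BigDecimal(timeoutS) / BigDecimal(3600))
      .setScale(2, RoundingMode.HALF_UP)
}
```

At $7.25/hour, a 30-second limit prorates to $0.06, and doubling the limit to 60 seconds doubles that to $0.12.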