Skip to content

Commit

Permalink
Support ambiguous parameteric datatype constructors (#106)
Browse files Browse the repository at this point in the history
This PR ensures ethos can properly parse and type check ambiguous
datatype constructors in the SMT-LIB 2.X standard.

Say a datatype constructor is "ambiguous" if it of type:
```
(-> T1 ... Tn T)
```
where some free parameter of T is not contained in the free parameters
of T1...Tn. (Note n may be 0).

To support ambiguous datatype constructors in Eunoia/Ethos, we require
several things.

**(1) Their declared type is made unambiguous.**

In particular, all ambiguous datatype constructors are instead defined
to be of type:
```
(-> (Quote T) T1 ... Tn T)
```
This is done automatically, so that for:
```
(declare-datatypes ((List 1)) ((par (X)((nil) (cons (head X) (tail (List X)))))))
```
If the user asked for the types of the constructors, they would get:
```
(eo::typeof nil) == (-> (Quote (List X)) (List X))
(eo::typeof cons) == (-> X (List X) (List X))
```

**(2) Type annotations on datatype constructors are treated as opaque
applications.**

We expect `(as nil (List Int))` to be a constant of type `(List Int)`.

This is implemented by parsing `(as nil (List Int))` as `(_ nil (List
Int))`, where this is an "opaque" application.

Effectively, this means that `(_ nil (List Int))` is a constant symbol
that is indexed by its type.

Note in this understanding, Ethos is overloading the meaning of the `as`
operator in SMT-LIB for two purposes. First, for overloaded symbols
which is a way of determining which symbol is being referred to, and for
parameteric datatypes where it is interpreted as being an (opaque)
argument. Only the latter is implemented currently.

**(3) Extend the semantics of eo::dt_selectors.**

Currently, `(eo::dt_selectors c)` evaluates the list of selectors only
if c is a constructor constant.

To handle ambiguous datatype constructors, we also say that
`(eo::dt_selectors (_ c T))` evaluates to the list of selectors of `c`.

Example:
```
(eo::dt_selectors (as nil (List Int))) == eo::List::nil
```

**(4) Extend the semantics of eo::dt_constructors.**

Currently, `(eo::dt_constructors T)` evaluates the list of constructors
only if `T` is an (uninstantiated) datatype type.

We extend this for instantiated datatypes to automatically apply the
appropriate annotations. For example,
```
(eo::dt_constructors List) == (eo::List::cons nil (eo::List::cons cons eo::List::nil))
(eo::dt_constructors (List T)) == (eo::List::cons (as nil (List T)) (eo::List::cons cons eo::List::nil))
```

**(5) The SMT-LIB syntax `as` is now parsed in Ethos (in all
contexts).**

Note this only is processed when applied to ambiguous datatype
constructors. Processing `as` for symbol disambiguation is still TODO.

---------

Co-authored-by: Hans-Jörg <[email protected]>
  • Loading branch information
ajreynol and hansjoergschurr authored Feb 4, 2025
1 parent f77aba5 commit 0d49e89
Show file tree
Hide file tree
Showing 17 changed files with 353 additions and 29 deletions.
7 changes: 7 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,12 @@
This file contains a summary of important user-visible changes.

ethos 0.1.2 prerelease
======================

- Adds support for the SMT-LIB `as` annotations for ambiguous datatype constructors, i.e. those whose return type cannot be inferred from its argument types. Following SMT-LIB convention, ambiguous datatype constructors are expected to be annotated with their *return* type using `as`, which internally is treated as a type argument to the constructor.
- The semantics for `eo::dt_constructors` is extended for instantiated parametric datatypes. For example calling `eo::dt_constructors` on `(List Int)` returns the list containing `cons` and `(as nil (List Int))`.
- The semantics for `eo::dt_selectors` is extended for annotated constructors. For example calling `eo::dt_selectors` on `(as nil (List Int))` returns the empty list.

ethos 0.1.1
===========

Expand Down
1 change: 1 addition & 0 deletions src/attr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ std::ostream& operator<<(std::ostream& o, Attr a)
case Attr::DATATYPE: o << "DATATYPE"; break;
case Attr::CODATATYPE: o << "CODATATYPE"; break;
case Attr::DATATYPE_CONSTRUCTOR: o << "DATATYPE_CONSTRUCTOR"; break;
case Attr::AMB_DATATYPE_CONSTRUCTOR: o << "AMB_DATATYPE_CONSTRUCTOR"; break;
default: o << "UnknownAttr(" << unsigned(a) << ")"; break;
}
return o;
Expand Down
7 changes: 4 additions & 3 deletions src/attr.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ namespace ethos {
enum class Attr
{
NONE = 0,

VAR,
IMPLICIT,
REQUIRES,
Expand All @@ -34,11 +34,11 @@ enum class Attr
BINDER,
LET_BINDER,
OPAQUE,

// smt3 things that are not strictly supported
SYNTAX,
RESTRICT,

// indicate how to construct proof rule steps
PREMISE_LIST,

Expand All @@ -53,6 +53,7 @@ enum class Attr
// datatypes
DATATYPE,
DATATYPE_CONSTRUCTOR,
AMB_DATATYPE_CONSTRUCTOR, // constructors requiring an opaque type argument
CODATATYPE
};

Expand Down
9 changes: 7 additions & 2 deletions src/cmd_parser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -243,6 +243,7 @@ bool CmdParser::parseNextCommand()
std::vector<size_t> arities;
std::map<const ExprValue*, std::vector<Expr>> dts;
std::map<const ExprValue*, std::vector<Expr>> dtcons;
std::unordered_set<const ExprValue*> ambCons;
if (isMulti)
{
d_lex.eatToken(Token::LPAREN);
Expand All @@ -268,7 +269,7 @@ bool CmdParser::parseNextCommand()
std::string name = d_eparser.parseSymbol();
dnames.push_back(name);
}
if (!d_eparser.parseDatatypesDef(dnames, arities, dts, dtcons))
if (!d_eparser.parseDatatypesDef(dnames, arities, dts, dtcons, ambCons))
{
d_lex.parseError("Failed to bind symbols for datatype definition");
}
Expand All @@ -282,9 +283,13 @@ bool CmdParser::parseNextCommand()
}
for (std::pair<const ExprValue* const, std::vector<Expr>>& c : dtcons)
{
// may be ambiguous
Attr ac = ambCons.find(c.first) != ambCons.end()
? Attr::AMB_DATATYPE_CONSTRUCTOR
: Attr::DATATYPE_CONSTRUCTOR;
Expr cons = Expr(c.first);
Expr stuple = d_state.mkList(c.second);
d_state.markConstructorKind(cons, Attr::DATATYPE_CONSTRUCTOR, stuple);
d_state.markConstructorKind(cons, ac, stuple);
}
if (isMulti)
{
Expand Down
56 changes: 46 additions & 10 deletions src/expr_parser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -867,7 +867,8 @@ bool ExprParser::parseDatatypesDef(
const std::vector<std::string>& dnames,
const std::vector<size_t>& arities,
std::map<const ExprValue*, std::vector<Expr>>& dts,
std::map<const ExprValue*, std::vector<Expr>>& dtcons)
std::map<const ExprValue*, std::vector<Expr>>& dtcons,
std::unordered_set<const ExprValue*>& ambCons)
{
Assert(dnames.size() == arities.size()
|| (dnames.size() == 1 && arities.empty()));
Expand Down Expand Up @@ -962,7 +963,8 @@ bool ExprParser::parseDatatypesDef(
dti = d_state.mkExpr(Kind::APPLY, dapp);
}
std::vector<std::pair<std::string, Expr>> toBind;
parseConstructorDefinitionList(dti, dts[dt.getValue()], dtcons, toBind);
std::vector<Expr>& clist = dts[dt.getValue()];
parseConstructorDefinitionList(dti, clist, dtcons, toBind, ambCons, params);
if (pushedScope)
{
d_lex.eatToken(Token::RPAREN);
Expand Down Expand Up @@ -990,7 +992,9 @@ void ExprParser::parseConstructorDefinitionList(
Expr& dt,
std::vector<Expr>& conslist,
std::map<const ExprValue*, std::vector<Expr>>& dtcons,
std::vector<std::pair<std::string, Expr>>& toBind)
std::vector<std::pair<std::string, Expr>>& toBind,
std::unordered_set<const ExprValue*>& ambCons,
const std::vector<Expr>& params)
{
d_lex.eatToken(Token::LPAREN);
Expr boolType = d_state.mkBoolType();
Expand All @@ -1017,6 +1021,24 @@ void ExprParser::parseConstructorDefinitionList(
toBind.emplace_back(ss.str(), updater);
d_lex.eatToken(Token::RPAREN);
}
bool isAmb = false;
if (!params.empty())
{
// if this is an ambiguous datatype constructor, we add (Quote T)
// as the first argument type.
Expr tup = d_state.mkExpr(Kind::TUPLE, typelist);
std::vector<Expr> pargs = Expr::getVariables(tup);
Expr fv = findFreeVar(dt, pargs);
Trace("param-dt") << "Parameteric datatype constructor: " << name;
Trace("param-dt") << (fv.isNull() ? " un" : " ") << "ambiguous"
<< std::endl;
if (!fv.isNull())
{
Expr odt = d_state.mkQuoteType(dt);
typelist.insert(typelist.begin(), odt);
isAmb = true;
}
}
Expr ctype = d_state.mkFunctionType(typelist, dt);
Expr cons = d_state.mkSymbol(Kind::CONST, name, ctype);
toBind.emplace_back(name, cons);
Expand All @@ -1028,6 +1050,10 @@ void ExprParser::parseConstructorDefinitionList(
Expr tester = d_state.mkSymbol(Kind::CONST, ss.str(), dtype);
toBind.emplace_back(ss.str(), tester);
dtcons[cons.getValue()] = sels;
if (isAmb)
{
ambCons.insert(cons.getValue());
}
}
}

Expand Down Expand Up @@ -1402,7 +1428,7 @@ Expr ExprParser::typeCheck(Expr& e, const Expr& expected)
return et;
}

void ExprParser::ensureBound(const Expr& e, const std::vector<Expr>& bvs)
Expr ExprParser::findFreeVar(const Expr& e, const std::vector<Expr>& bvs)
{
std::vector<Expr> efv = Expr::getVariables(e);
for (const Expr& v : efv)
Expand All @@ -1414,14 +1440,24 @@ void ExprParser::ensureBound(const Expr& e, const std::vector<Expr>& bvs)
{
continue;
}
std::stringstream msg;
msg << "Unexpected free parameter in expression:" << std::endl;
msg << " Expression: " << e << std::endl;
msg << " Free parameter: " << v << std::endl;
msg << "Bound parameters: " << bvs << std::endl;
d_lex.parseError(msg.str());
return v;
}
}
return d_null;
}

void ExprParser::ensureBound(const Expr& e, const std::vector<Expr>& bvs)
{
Expr v = findFreeVar(e, bvs);
if (!v.isNull())
{
std::stringstream msg;
msg << "Unexpected free parameter in expression:" << std::endl;
msg << " Expression: " << e << std::endl;
msg << " Free parameter: " << v << std::endl;
msg << "Bound parameters: " << bvs << std::endl;
d_lex.parseError(msg.str());
}
}

void ExprParser::processAttributeMap(const AttrMap& attrs,
Expand Down
30 changes: 27 additions & 3 deletions src/expr_parser.h
Original file line number Diff line number Diff line change
Expand Up @@ -85,11 +85,20 @@ class ExprParser
* datatype_dec :=
* (<constructor_dec>+) | (par (<symbol>+) (<constructor_dec>+))
* constructor_dec := (<symbol> (<symbol> <sort>)∗)
*
* @param dnames The names of the datatypes.
* @param arities The arities of the datatypes given by the names dnames.
* @param dts Mapping from datatypes to their constructor symbols.
* @param dtcons Mapping from constructors to their selector symbols.
* @param ambCons The subset of constructors in the domain of dtcons that are
* ambiguous constructors, i.e. require their return type as the first
* argument.
*/
bool parseDatatypesDef(const std::vector<std::string>& dnames,
const std::vector<size_t>& arities,
std::map<const ExprValue*, std::vector<Expr>>& dts,
std::map<const ExprValue*, std::vector<Expr>>& dtcons);
std::map<const ExprValue*, std::vector<Expr>>& dtcons,
std::unordered_set<const ExprValue*>& ambCons);
/**
* Parses ':X', returns 'X'
*/
Expand Down Expand Up @@ -139,7 +148,14 @@ class ExprParser
Expr getProofRule(const std::string& name);
/** Bind, or throw error otherwise */
void bind(const std::string& name, Expr& e);
/** Ensure bound */
/**
* @return a variable from the free variables of e that is not in bvs if
* one exists, or the null expression otherwise.
*/
Expr findFreeVar(const Expr& e, const std::vector<Expr>& bvs);
/**
* Throw an exception if the free variables of e are not in bvs.
*/
void ensureBound(const Expr& e, const std::vector<Expr>& bvs);
//-------------------------- end checking
/**
Expand All @@ -159,12 +175,20 @@ class ExprParser
/**
* Parse constructor definition list, add to declaration type. The expected
* syntax is '(<constructor_dec>+)'.
* @param dt The datatype this constructor list is for.
* @param conslist Populated with the constructors of dt.
* @param dtcons Mapping from constructors to their selector symbols.
* @param toBind The symbols to bind.
* @param ambCons The subset of conslist that are ambiguous constructors.
* @param param The parameters of dt.
*/
void parseConstructorDefinitionList(
Expr& dt,
std::vector<Expr>& conslist,
std::map<const ExprValue*, std::vector<Expr>>& dtcons,
std::vector<std::pair<std::string, Expr>>& toBind);
std::vector<std::pair<std::string, Expr>>& toBind,
std::unordered_set<const ExprValue*>& ambCons,
const std::vector<Expr>& params);
/** Return the unsigned for the current token string. */
uint32_t tokenStrToUnsigned();
/**
Expand Down
2 changes: 2 additions & 0 deletions src/kind.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,7 @@ std::ostream& operator<<(std::ostream& o, Kind k)
case Kind::TUPLE: o << "TUPLE"; break;
case Kind::PROGRAM: o << "PROGRAM"; break;
case Kind::AS: o << "AS"; break;
case Kind::AS_RETURN: o << "AS_RETURN"; break;
case Kind::PARAMETERIZED: o << "PARAMETERIZED"; break;
case Kind::APPLY_OPAQUE: o << "APPLY_OPAQUE"; break;
// literals
Expand Down Expand Up @@ -118,6 +119,7 @@ std::string kindToTerm(Kind k)
case Kind::LAMBDA: ss << "lambda"; break;
case Kind::PROGRAM: ss << "program"; break;
case Kind::AS: ss << "eo::as"; break;
case Kind::AS_RETURN: ss << "as"; break;
case Kind::PARAMETERIZED: ss << "eo::_"; break;
// operations on literals
default:
Expand Down
3 changes: 2 additions & 1 deletion src/kind.h
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,8 @@ enum class Kind
LAMBDA,
TUPLE,
PROGRAM,
AS,
AS, // (eo::as t T), where T is the type of t
AS_RETURN, // SMT-LIB (as t T), where T is the return type of t
PARAMETERIZED,
APPLY_OPAQUE,

Expand Down
20 changes: 18 additions & 2 deletions src/state.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,7 @@ State::State(Options& opts, Stats& stats)
bindBuiltinEval("dt_selectors", Kind::EVAL_DT_SELECTORS);

// as
bindBuiltin("as", Kind::AS_RETURN);
bindBuiltinEval("as", Kind::AS);

// note we don't allow parsing (Proof ...), (Quote ...), or (quote ...).
Expand Down Expand Up @@ -564,6 +565,9 @@ Expr State::mkBoolType()

Expr State::mkListType() { return d_listType; }

Expr State::mkListCons() { return d_listCons; }
Expr State::mkListNil() { return d_listNil; }

Expr State::mkProofType(const Expr& proven)
{
return Expr(mkExprInternal(Kind::PROOF_TYPE, {proven.getValue()}));
Expand Down Expand Up @@ -912,10 +916,22 @@ Expr State::mkExpr(Kind k, const std::vector<Expr>& children)
<< ", " << children.size() << " arguments" << std::endl;
}
}
else if (k == Kind::AS_RETURN)
{
// (as nil (List Int)) --> (_ nil (List Int))
if (getConstructorKind(vchildren[0]) == Attr::AMB_DATATYPE_CONSTRUCTOR
&& children.size() == 2)
{
Trace("overload") << "...type arg for ambiguous constructor" << std::endl;
return mkExpr(Kind::APPLY_OPAQUE, {children[0], children[1]});
}
// note we don't support other uses of `as` for symbol disambiguation yet,
// we fallthrough and construct the bogus term of kind AS_RETURN.
}
else if (k==Kind::AS)
{
// if it has 2 children, process it, otherwise we make the bogus term
// below
// if it has 2 children, process it, otherwise we make the bogus term of
// kind AS below
if (vchildren.size()==2)
{
Trace("overload") << "process eo::as " << children[0] << " " << children[1] << std::endl;
Expand Down
4 changes: 4 additions & 0 deletions src/state.h
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,10 @@ class State
Expr mkBoolType();
/** eo::List */
Expr mkListType();
/** eo::List::cons */
Expr mkListCons();
/** eo::List::nil */
Expr mkListNil();
/** (Proof <proven>) */
Expr mkProofType(const Expr& proven);
/** (Quote <term>) */
Expand Down
Loading

0 comments on commit 0d49e89

Please sign in to comment.