Crowd sourced natural language programming

D-503
Posts: 75
Joined: Sun Apr 15, 2012 11:35 pm UTC

Postby D-503 » Sat Sep 01, 2012 7:58 am UTC

I have an idea that I've been sitting on for a while, and tonight I'm going to share it with you all. My one-line description of it is Wikipedia + Wolfram Alpha. I'm calling it Language. However, since that domain name is pretty much universally taken, I might have to call it something else. I have some UI mockups here that will explain more:
Results for query: How does language work?

Results for query: I want to create a language node


So what do you all think? Seem cool? Is there someplace I should present it? Anyone want to help?

UPDATE: There is now a prototype running here: https://language-nathanathan.rhcloud.com/
Last edited by D-503 on Mon Nov 26, 2012 2:25 am UTC, edited 1 time in total.

Jplus
Posts: 1711
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Crowd sourced natural language programming

Postby Jplus » Sat Sep 01, 2012 10:25 am UTC

Could you give a bit more explanation of what we're looking at?
"There are only two hard problems in computer science: cache coherence, naming things, and off-by-one errors." (Phil Karlton and Leon Bambrick)

coding and xkcd combined

(Julian/Julian's)

D-503
Posts: 75
Joined: Sun Apr 15, 2012 11:35 pm UTC

Re: Crowd sourced natural language programming

Postby D-503 » Sat Sep 01, 2012 11:47 pm UTC

I'll try giving a more concrete use case:
Say I want to plot a function. I could use Wolfram Alpha, but then I'm stuck with whatever plot widget they give me (and unless you pay them it's not a very good one). With the site I'm proposing, you would likely see several plotting widgets, and if one doesn't quite do what you need, you could fork it and create a better version. If other people like your widget more as well, they will vote it up so it appears higher in the results, as on Stack Overflow.
Another thing you would be able to do is create specialized widgets that only match one or a small number of queries (e.g. "plot the cosmic background radiation"). Think of the Unix philosophy: Write programs that do one thing and do it well.
But the purpose isn't just to make an open-source variant of Wolfram Alpha. I think there is a lot of room for innovation in what kinds of queries are possible. As a programmer, these are a couple of examples that came to mind for me:
1. Code generation from queries like "create a html form with name age and gender fields"
2. When you get an esoteric error message you could register a widget for it that tells others exactly how you resolved it.
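As a rough illustration of the first example, here is a minimal sketch of what a code-generating widget might do with that query. The names `parse_form_query` and `render_form` are hypothetical, the pattern only covers queries shaped like the one above, and it is written in Python for brevity rather than as anything the site actually runs.

```python
import re
from html import escape

# Hypothetical pattern: matches "create a(n) html form with <fields> field(s)".
FORM_QUERY = re.compile(
    r"^create an? html form with (?P<fields>.+?) fields?$", re.IGNORECASE
)


def parse_form_query(query):
    """Return the list of field names, or None if this widget cannot parse the query."""
    match = FORM_QUERY.match(query.strip())
    if match is None:
        return None
    # Split on commas, the word "and", or plain whitespace.
    parts = re.split(r"\s*,\s*|\s+and\s+|\s+", match.group("fields"))
    return [p for p in parts if p and p.lower() != "and"]


def render_form(fields):
    """Generate a minimal HTML form with one text input per field."""
    lines = ["<form>"]
    for name in fields:
        safe = escape(name, quote=True)
        lines.append(f'  <label>{safe} <input type="text" name="{safe}"></label>')
    lines.append('  <input type="submit">')
    lines.append("</form>")
    return "\n".join(lines)


fields = parse_form_query("create a html form with name age and gender fields")
if fields is not None:
    print(render_form(fields))
```

Returning `None` for anything the pattern does not cover is the important part: a widget that declines queries it cannot handle never shows up where it is irrelevant.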

I might need your help figuring out exactly what needs explaining. Let me know if there is something I can elaborate on.

Jplus
Posts: 1711
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Crowd sourced natural language programming

Postby Jplus » Sun Sep 02, 2012 6:41 pm UTC

Well, you've at least given me an idea of the what-level explanation (the possibilities). Perhaps you could elaborate more on the why and the how: why did you start this (what's the purpose) and how does it work?

Also, I don't really understand yet why it's called "natural language programming".

Yakk
Poster with most posts but no title.
Posts: 11083
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Crowd sourced natural language programming

Postby Yakk » Sun Sep 02, 2012 7:20 pm UTC

The input to the website is a natural language string ("create an html website with name and age fields").

The output is code.

In effect, the user programmed the output using natural language.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

dudiobugtron
Posts: 1098
Joined: Mon Jul 30, 2012 9:14 am UTC
Location: The Outlier

Re: Crowd sourced natural language programming

Postby dudiobugtron » Wed Sep 05, 2012 12:21 am UTC

Having read Yakk's explanation, I can now say this is an awesome idea!

I'm a bit stumped on the up-voting system thing though. Surely a widget's suitability would depend on the exact query, not just on how much people liked it for any query. I think for it to work properly you'd need some sort of google-esque ranking algorithm, which (for eg) correlated people's searches with the code they eventually used, or something like that.
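A minimal sketch of that kind of per-query ranking, assuming the site could record which widget a user ended up using for each search. The class `SelectionLog` and its methods are made-up names for illustration, not part of any existing system.

```python
from collections import Counter, defaultdict


class SelectionLog:
    """Hypothetical per-query ranking: count which widget users picked for each query."""

    def __init__(self):
        # Maps a lowercased query string to a Counter of widget ids.
        self._picks = defaultdict(Counter)

    def record(self, query, widget_id):
        """Record that a user searching for `query` ended up using `widget_id`."""
        self._picks[query.strip().lower()][widget_id] += 1

    def rank(self, query, candidates):
        """Order candidate widgets by how often they were picked for this exact query."""
        picks = self._picks.get(query.strip().lower(), Counter())
        # sorted() is stable, so widgets with equal counts keep their incoming order.
        return sorted(candidates, key=lambda w: picks[w], reverse=True)


log = SelectionLog()
log.record("draw a square", "square-widget")
log.record("draw a square", "square-widget")
log.record("draw a square", "shapes-widget")
print(log.rank("Draw a square", ["shapes-widget", "square-widget", "paint-widget"]))
```

Matching only on the exact query string is the weak point of this sketch; a real system would need some notion of similar queries.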

Too bad this doesn't already exist. If it did, you could ask it to code itself for you!

D-503
Posts: 75
Joined: Sun Apr 15, 2012 11:35 pm UTC

Re: Crowd sourced natural language programming

Postby D-503 » Wed Sep 05, 2012 3:26 am UTC

dudiobugtron wrote:Having read Yakk's explanation, I can now say this is an awesome idea!

I'm a bit stumped on the up-voting system thing though. Surely a widget's suitability would depend on the exact query, not just on how much people liked it for any query. I think for it to work properly you'd need some sort of google-esque ranking algorithm, which (for eg) correlated people's searches with the code they eventually used, or something like that.

Thanks!

I think a voting system where votes from all queries are combined will work because it rewards widgets that parse only the queries they can handle, and widgets that can handle as many queries as possible. For example, I could make a widget that parses every query, but since it would be completely irrelevant most of the time, it would receive a lot of down votes. Conversely, I could make a widget that only parses the single query "draw a square", but a widget that can also parse "draw a four sided shape" and "show square" and lots of other things for which an image of a square would suffice would be preferable.

That said, I'm sure a more complex voting system could produce better results.

One point of tension in the ranking algorithm is the "do one thing and do it well" philosophy. A widget that specializes in drawing animals can probably do it better than a generalized drawing widget. So I want to give widgets that parse only a few queries a better ranking than they would otherwise receive, but I want to do this while leaving an incentive to parse more queries if possible.

I can think of some heuristics (e.g. if a widget only matches a small number of queries and has a lot of down votes, it's probably a poor-quality widget rather than one that appears in queries it wasn't intended for), but I haven't yet managed to develop a fully thought-out ranking algorithm. Here's the very simple ranking function which I'm planning to start with. I'm sure there are lots of ways to improve upon it if anyone wants to make an attempt at it.

rank = ((up votes) - (down votes)) / (queries parsed so far)
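A minimal sketch of that formula in Python. The vote counts in the example are invented, and returning 0.0 for a widget that has not parsed any query yet is an assumption of the sketch, since the formula is undefined in that case.

```python
def rank(up_votes, down_votes, queries_parsed):
    """Net votes per parsed query, following the formula above.

    Returning 0.0 when nothing has been parsed yet is an assumption of
    this sketch; the formula itself does not cover that case.
    """
    if queries_parsed == 0:
        return 0.0
    return (up_votes - down_votes) / queries_parsed


# Invented numbers for three kinds of widget:
print(rank(50, 400, 10000))  # parses everything, mostly irrelevant: -0.035
print(rank(300, 20, 1000))   # broad and usually relevant: 0.28
print(rank(8, 1, 10))        # narrow specialist: 0.7
```

With these numbers the narrow specialist comes out on top and the catch-all widget goes negative. Every query a widget parses without earning a vote lowers its score, so widgets are pushed to match only where they are useful.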

I'm not planning to attempt to do anything with user histories in the first iteration, because keeping track of users adds a lot of complexity to the system.

dudiobugtron wrote:Too bad this doesn't already exist. If it did, you could ask it to code itself for you!

Once I make it, I'll make a "show me your source code" widget for that.

dudiobugtron
Posts: 1098
Joined: Mon Jul 30, 2012 9:14 am UTC
Location: The Outlier

Re: Crowd sourced natural language programming

Postby dudiobugtron » Wed Sep 05, 2012 6:56 am UTC

Oh right that's clever - get the widgeteers to do the work! :)

I think it would work best if the website itself (rather than each individual widget writer, for each widget) did some of that work too, though. Otherwise a lot of widgets are going to get overlooked because someone used an extra word, or a misspelling, or a synonym, etc.

Or, perhaps the task of matching a search query to a relevant widget should be something that is aided by intermediary widgets which are also 'crowd sourced' somehow?
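A minimal sketch of site-level query normalization along those lines, assuming a small known vocabulary and a synonym table. Both tables and the function name `normalize` are invented for illustration, and `difflib` from the Python standard library stands in for a proper spelling corrector.

```python
import difflib
import re

# Made-up vocabulary and synonym table, purely for illustration.
VOCABULARY = ["draw", "plot", "square", "circle", "population", "of", "a"]
SYNONYMS = {"show": "draw", "paint": "draw", "graph": "plot"}


def normalize(query):
    """Lowercase, strip punctuation, map synonyms and repair near-miss spellings."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    result = []
    for word in words:
        word = SYNONYMS.get(word, word)
        if word not in VOCABULARY:
            # Replace with the closest known word, if any is similar enough.
            close = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.75)
            if close:
                word = close[0]
        result.append(word)
    return " ".join(result)


print(normalize("Show a sqaure!"))  # draw a square
```

Widgets would then match against the normalized query, so each widget author would not have to anticipate every spelling and synonym separately.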

Jplus
Posts: 1711
Joined: Wed Apr 21, 2010 12:29 pm UTC
Location: Netherlands

Re: Crowd sourced natural language programming

Postby Jplus » Wed Sep 05, 2012 7:34 am UTC

I don't want to ruin the game, but apart from Yakk's rather terse answer I haven't seen any account for the "how" yet, and the "why" is lacking altogether. Now I can guess that the "why" is just "because we can" (exploration), but the "how" still has some wide open ends.

I can see how it's crowd-sourced (obviously), and I see that natural language is used as the user interface. However I don't see yet why the title says "natural language programming". The widgets themselves aren't written in natural language, in fact I'm not sure yet whether they can have dynamic behaviour. Are the widgets supposed to use natural language queries as a part of their implementation, which will then be forwarded to other widgets? If the latter is the case, how does the system avoid the combinatorial explosions which are bound to arise because most queries will match more than one widget?

Yakk
Poster with most posts but no title.
Posts: 11083
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Crowd sourced natural language programming

Postby Yakk » Wed Sep 05, 2012 10:46 am UTC

The fact that a compiler outputs ASM doesn't mean that the person programmed in ASM.

If you have an engine that takes a natural language string and outputs machine-executable information, that is all that is required. Well, and that the output bears a resemblance, possibly, to what was asked, heh.

I think what is described would be pretty hard. Of course, so is Wolfram Alpha, but it got implemented.

D-503
Posts: 75
Joined: Sun Apr 15, 2012 11:35 pm UTC

Re: Crowd sourced natural language programming

Postby D-503 » Wed Sep 05, 2012 4:00 pm UTC

Jplus wrote:I don't want to ruin the game, but apart from Yakk's rather terse answer I haven't seen any account for the "how" yet, and the "why" is lacking altogether. Now I can guess that the "why" is just "because we can" (exploration), but the "how" still has some wide open ends.

I can see how it's crowd-sourced (obviously), and I see that natural language is used as the user interface. However I don't see yet why the title says "natural language programming". The widgets themselves aren't written in natural language, in fact I'm not sure yet whether they can have dynamic behaviour. Are the widgets supposed to use natural language queries as a part of their implementation, which will then be forwarded to other widgets? If the latter is the case, how does the system avoid the combinatorial explosions which are bound to arise because most queries will match more than one widget?


Unfortunately, I don't have much time to write up a detailed reply at the moment, so this might be a bit terse.

Widgets can have dynamic behavior at runtime because they are HTML and JavaScript.

Additionally, from my first link on "how does language work?"
The widget rendering code should have access to the parse tree. That way there can be a generalized graph plotting widget for instance. Child nodes in the parse tree may have additional functions and data attached for use by their parent's rendering routine.


By that mechanism it is possible to have dynamic behavior where the widget changes depending on the query it parsed.
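A minimal sketch of that mechanism, assuming a parse-tree node that carries optional data and a function for its parent to use. The real widgets are HTML and JavaScript; this is Python for brevity, and `ParseNode` and `render_plot` are hypothetical names, not the prototype's API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ParseNode:
    """Hypothetical parse-tree node; `data` and `fn` are what a child attaches for its parent."""
    text: str
    children: list = field(default_factory=list)
    data: dict = field(default_factory=dict)
    fn: Optional[Callable[[float], float]] = None


def render_plot(node, xs):
    """A generalized 'plot' widget: it renders whatever function its child node supplies."""
    child = node.children[0]
    if child.fn is None:
        raise ValueError(f"child {child.text!r} supplies no function to plot")
    label = child.data.get("label", child.text)
    points = [(x, child.fn(x)) for x in xs]
    return {"title": f"plot of {label}", "points": points}


# The child node for "x squared" attaches a label and a function.
square = ParseNode("x squared", data={"label": "x^2"}, fn=lambda x: x * x)
query = ParseNode("plot x squared", children=[square])
print(render_plot(query, [0, 1, 2]))
```

The plotting widget knows nothing about squares; swapping in a different child node changes what gets plotted without touching the renderer.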

The time complexity is a problem, but at this point I'm not sure how big a problem.

D-503
Posts: 75
Joined: Sun Apr 15, 2012 11:35 pm UTC

Re: Crowd sourced natural language programming

Postby D-503 » Mon Nov 26, 2012 2:24 am UTC

Hey everyone, I have a prototype ready!
I'm hoping some of you will be kind enough to be my alpha testers and make some Language nodes.
If you need inspiration, here are a few ideas:
  • Take an API for a website you like, and come up with ways to interact with it through Language queries. (e.g. "____'s github repositories.")
  • Plot datasets by mashing up YQL queries with D3 visualizations. ("gini index of United States since 1980")
  • Coding help/code generation. ("show node.js proxy code", "python transpose matrix function", "escape this string: _____", "generate html form with the following fields...")

To give an idea of how to create widgets, I've implemented some examples:

Here's a basic "paint a picture" widget. (The actual painting code isn't mine.)

This widget does some data plotting with YQL and D3:
"population of Canada since 1970"

This widget is an interpreter for a Lisp dialect I'm working on:
"(+ 1 2 (+ 3 4 5))"
Unfortunately, sums are the only thing it can do for now. It's easy to add new primitive functions (for example, this is the language node for +), but I'm struggling with lambdas. The interpreter requests the query's parse tree from the server. However, that parse tree isn't a normal parse tree: the grammar is ambiguous and each node has an array of interpretations, so I'm calling it a multi-parse tree. This allows some interesting behavior in the interpreter: a program can return a value for every way it can be interpreted, without causing an error.
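A minimal sketch of how evaluation over such a multi-parse tree could work, written in Python rather than the prototype's JavaScript. The node layout (a list of interpretations, each either a number or an operator with child nodes) is an assumption made for this example, not the prototype's actual format.

```python
from itertools import product

# Primitive functions known to this toy interpreter (only sums, as in the post).
PRIMITIVES = {"+": lambda *args: sum(args)}


def evaluate(node):
    """Return every value a multi-parse node can evaluate to.

    A node is a list of interpretations. Each interpretation is either a
    number or a tuple (operator, [child nodes]). This layout is an
    assumption of the sketch.
    """
    results = []
    for interpretation in node:
        if isinstance(interpretation, (int, float)):
            results.append(interpretation)
            continue
        operator, children = interpretation
        if operator not in PRIMITIVES:
            continue  # skip interpretations that cannot be evaluated
        # Try every combination of the children's possible values.
        for args in product(*(evaluate(child) for child in children)):
            results.append(PRIMITIVES[operator](*args))
    return results


# "(+ 1 2 (+ 3 4 5))" with exactly one interpretation per node:
inner = [("+", [[3], [4], [5]])]
outer = [("+", [[1], [2], inner])]
print(evaluate(outer))  # [15]

# A made-up ambiguity: the second argument could mean 10 or 2.
print(evaluate([("+", [[1], [10, 2]])]))  # [11, 3]
```

The number of results is the product of the children's interpretation counts, which is where the time complexity concern raised earlier in the thread comes from.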

Finally, here's a "show source" widget for the site.

One thing I'm interested in is seeing how standalone queries like "population of ____ since _____", can be mashed up with interpreted code queries.

screen317
Posts: 252
Joined: Mon May 16, 2011 7:46 pm UTC

Re: Crowd sourced natural language programming

Postby screen317 » Fri Nov 30, 2012 8:24 pm UTC

All of your GitHub links are dead:

"Page did not respond in a timely fashion."

D-503
Posts: 75
Joined: Sun Apr 15, 2012 11:35 pm UTC

Re: Crowd sourced natural language programming

Postby D-503 » Sat Dec 01, 2012 4:01 am UTC

I'm not sure which GitHub links you're referring to. None of the GitHub links I've tried is dead. Could you post one of the dead links?

EvanED
Posts: 4330
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Crowd sourced natural language programming

Postby EvanED » Sat Dec 01, 2012 4:05 am UTC

"Page did not respond in a timely fashion" sounds like either a local problem or a temporary outage on GitHub's part, not a link rot problem.

lorb
Posts: 405
Joined: Wed Nov 10, 2010 10:34 am UTC
Location: Austria

Re: Crowd sourced natural language programming

Postby lorb » Thu Dec 27, 2012 3:11 pm UTC

Input: "what is the value of pi?"
Output: "Sorry, your query could not be interpreted."

Input: "a random number"
Output: "Sorry, your query could not be interpreted."

... and so on
Please be gracious in judging my english. (I am not a native speaker/writer.)
http://decodedarfur.org/

