digitalmars.D - Bug Prediction at Google
- Robert Clipsham (15/15) Dec 15 2011 I just read this pretty interesting article on the Google Engineering
- Andrew Wiley (5/16) Dec 15 2011 Well, Github does have an API to allow that sort of thing to happen.
- Brad Roberts (57/66) Dec 15 2011 I'm 90% done with adding pull testing to the existing auto-tester fleet....
- Brad Anderson (2/71) Dec 16 2011 Out of curiosity, how much is a little?
- Brad Roberts (14/74) Dec 16 2011 very ugly but minimally functional: http://d.puremagic.com/test-results/...
- Robert Clipsham (12/24) Dec 17 2011 Idea: I noticed most pull requests were failing when I looked at it, due...
- Brad Roberts (6/32) Dec 17 2011 Yeah. I know I need to do something in that space and just haven't yet....
- Martin Nowak (7/47) Dec 19 2011 Another optimization idea. Put pull request that fail to merge on an
- Brad Roberts (8/49) Dec 19 2011 Way ahead of you, but low priority on it from a throughput standpoint. ...
- Brad Anderson (4/105) Dec 16 2011 That seems very reasonable. It sounds like this autotester will help
I just read this pretty interesting article on the Google Engineering Tools
website, thought it might interest some people here:

http://google-engtools.blogspot.com/2011/12/bug-prediction-at-google.html
( http://goo.gl/2O6YT <= a short link in case the above one gets wrapped)

It basically describes a new process in place at Google whereby each file
within a project is assigned a rating saying how likely the given file is to
have a bug in it compared to everything else. This can be used by code
reviewers so they can take extra care when reviewing certain changes.

I really think github needs built in review tools (more advanced than just
pull requests) to allow things like the auto-tester to be run, or algorithms
like this to be used for manual review and so on.

--
Robert
http://octarineparrot.com/
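[For concreteness, a small D sketch of the kind of time-weighted score the
article describes, where each past bug-fixing commit to a file contributes
more the more recent it is. The sigmoid weighting below is illustrative
only, not Google's published formula.]

// Sketch of a time-decayed bug score: each bug-fixing commit that touched a
// file contributes more weight the more recent it is. Illustrative only.
import std.math : exp;

// fixTimes: timestamps of bug-fixing commits touching the file, normalized
// to [0, 1], where 0 is the start of the history and 1 is "now".
double bugScore(const double[] fixTimes)
{
    double score = 0;
    foreach (t; fixTimes)
        score += 1.0 / (1.0 + exp(-12.0 * t + 12.0));
    return score; // higher means more recent bug-fix churn
}

unittest
{
    // An old fix contributes almost nothing; a very recent one nearly 0.5.
    assert(bugScore([0.0]) < 0.01);
    assert(bugScore([1.0]) > 0.45);
}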
Dec 15 2011
On Thu, Dec 15, 2011 at 6:04 PM, Robert Clipsham <robert octarineparrot.com> wrote:
> I just read this pretty interesting article on the Google Engineering Tools
> website, thought it might interest some people here:
>
> http://google-engtools.blogspot.com/2011/12/bug-prediction-at-google.html
> ( http://goo.gl/2O6YT <= a short link in case the above one gets wrapped)
>
> It basically describes a new process in place at Google whereby each file
> within a project is assigned a rating saying how likely the given file is to
> have a bug in it compared to everything else. This can be used by code
> reviewers so they can take extra care when reviewing certain changes.
>
> I really think github needs built in review tools (more advanced than just
> pull requests) to allow things like the auto-tester to be run, or algorithms
> like this to be used for manual review and so on.

Well, Github does have an API to allow that sort of thing to happen. In
theory, a bot could examine a pull request, merge it and run tests, and post
back the results.
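[A minimal D sketch of the kind of bot described here, using GitHub's public
v3 REST API. The repository slug and the runTests() helper are placeholders,
and authentication, the required User-Agent header, paging and error handling
are all omitted.]

import std.conv : to;
import std.format : format;
import std.json : parseJSON;
import std.net.curl : get, post;

enum repo = "D-Programming-Language/dmd"; // assumed repository slug

// Placeholder: merge the given head commit onto trunk, build, run the tests.
bool runTests(string headSha) { return true; }

void main()
{
    // List the currently open pull requests.
    auto pulls = parseJSON(get("https://api.github.com/repos/" ~ repo ~ "/pulls"));

    foreach (pr; pulls.array)
    {
        auto number = pr["number"].integer;
        auto sha    = pr["head"]["sha"].str;

        immutable ok = runTests(sha);

        // Report back as an issue comment (needs an OAuth token in practice).
        auto comment = format(`{"body": "auto-tester: %s at %s"}`,
                              ok ? "PASS" : "FAIL", sha);
        post("https://api.github.com/repos/" ~ repo ~ "/issues/"
             ~ number.to!string ~ "/comments", comment);
    }
}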
Dec 15 2011
On Thu, 15 Dec 2011, Andrew Wiley wrote:
> On Thu, Dec 15, 2011 at 6:04 PM, Robert Clipsham <robert octarineparrot.com> wrote:
>> I really think github needs built in review tools (more advanced than just
>> pull requests) to allow things like the auto-tester to be run, or algorithms
>> like this to be used for manual review and so on.
>
> Well, Github does have an API to allow that sort of thing to happen. In
> theory, a bot could examine a pull request, merge it and run tests, and
> post back the results.

I'm 90% done with adding pull testing to the existing auto-tester fleet.
It's based on the work that Daniel did. The basic overview:

server:
    every 10 minutes, check github for changes to pull requests
        (if they had notifications for pull changes, I'd use those instead)
    every time a trunk commit notification is received, check github for
        changes to pull requests

client:
    forever {
        check for trunk build
        if yes { build trunk; continue }
        check for pull build
        if yes { build pull; continue }
        sleep 1m
    }

Left to do:

1) deploy changes to the tester hosts (it's on 2 already)
2) finish the ui
3) trigger pull rebuilds when trunk is updated
4) add back in support for related pull requests (ie, two pulls that
   separately fail but together succeed)
5) consider updating the pull request on github with tester results. This
   one needs to be done very carefully so as not to spam the report every
   time a build fails or succeeds.
6) update the auto-tester grease monkey script to integrate the pull tester
   results with github's ui.

I'll hopefully finish 1 and 2 tonight. I can do 3 manually until it's
automated. I'm not sure about the ordering of 4-6; they're nice-to-haves
rather than must-haves.

All these extra builds are going to cost a lot of time. There are about 100
open pull requests right now. The fastest runs are on the order of 10
minutes; that's 6 per hour, or roughly 17 hours for the whole queue. The
slowest are closer to an hour. So, obviously, there are some growing pains
to deal with. I'll probably add a way for github committers to prioritize
pull requests so they build first.

Luckily this stuff is trivial to throw hardware at.. it's super
parallelizable. Also, the hardware I have for those long runs is super old.
I think the freebsd/32 box is a P4-era box. The win/32 box is my Asus Eee
Atom-based netbook.

If anyone wants to volunteer build hardware, particularly for the non-linux
platforms, please contact me via email (let's not clutter up the newsgroup
with that chatter). Requirements:

- I need to be able to access it remotely.
- It needs to have reliable connectivity (bandwidth is pretty much a
  non-issue.. it doesn't need much at all).
- It needs to be hardware you're willing to have hammered fairly hard at
  random times.

I'll almost certainly write some code during the holiday to fire up and use
EC2 nodes for the windows and linux builds. With the application of just a
little money, all of those runs could be done fully in parallel, which would
just be sweet to see. Ok, I admit it.. I love working at Amazon on EC2, and
I'm happy to finally have a project that could actually use it.

Later,
Brad
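[For reference, the client loop described above written out as a D sketch;
the helpers are hypothetical stand-ins for whatever protocol the real tester
client uses to talk to the server.]

import core.thread : Thread;
import core.time : minutes;

bool trunkWorkPending() { return false; } // placeholder: ask the server
bool pullWorkPending()  { return false; } // placeholder: ask the server
void buildTrunk() {}  // placeholder: update trunk, build, run the test suite
void buildPull()  {}  // placeholder: merge the pull, build, run the test suite

void main()
{
    for (;;)
    {
        // Trunk builds always take priority over pull builds.
        if (trunkWorkPending()) { buildTrunk(); continue; }
        if (pullWorkPending())  { buildPull();  continue; }

        // Nothing to do; poll again in a minute.
        Thread.sleep(1.minutes);
    }
}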
Dec 15 2011
On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:
> [...]
>
> I'll almost certainly write some code during the holiday to fire up and use
> EC2 nodes for the windows and linux builds. With the application of just a
> little money, all of those runs could be done fully in parallel, which
> would just be sweet to see.

Out of curiosity, how much is a little?
Dec 16 2011
On 12/16/2011 1:29 PM, Brad Anderson wrote:
> On Thu, Dec 15, 2011 at 6:43 PM, Brad Roberts <braddr puremagic.com> wrote:
>> Left to do:
>>
>> 1) deploy changes to the tester hosts (it's on 2 already)

done

>> 2) finish the ui

very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml

>> 3) trigger pull rebuilds when trunk is updated

partly implemented, but not being done yet

>> [...]
>> With the application of just a little money, all of those runs could be
>> done fully in parallel, which would just be sweet to see.
>
> Out of curiosity, how much is a little?

I'll need to experiment. It's the kind of thing where the more money thrown
at the problem, the faster the builds can be churned through. The limit of
that would be one box per platform per pull request, which is a silly
extreme (and not possible anyway, since EC2 doesn't support some platforms,
such as OS X).

One c1.medium running 24x7 at the current spot price is about $29/month.
Spot is perfect because this is a totally interruptible / resumable process.
I'll need to do a little work to make it resume an in-flight test run, but
it's not hard at all to do.

After watching the builds slowly churn through the first round of build
requests, it's clear to me that one of the areas I'll need to invest a good
bit more time in is the scheduling of pulls. Right now it's pseudo-random
(an artifact of how the data is stored in a hash table).

Later,
Brad
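[A sketch of one way the pseudo-random ordering could become a deterministic
schedule; the Pull record and its fields are assumptions here, not the
tester's actual data model. Committer prioritization would then just mean
bumping the priority field.]

import std.algorithm : sort;

struct Pull
{
    int  number;     // github pull request number
    int  priority;   // bumped by a committer; 0 by default
    long updatedAt;  // unix time of the last push to the pull
}

// Highest priority first; within a priority level, least recently updated
// first, so nothing starves.
Pull[] schedule(Pull[] pending)
{
    sort!((a, b) => a.priority != b.priority
                  ? a.priority > b.priority
                  : a.updatedAt < b.updatedAt)(pending);
    return pending;
}

unittest
{
    auto q = [Pull(1, 0, 200), Pull(2, 5, 300), Pull(3, 0, 100)];
    assert(schedule(q)[0].number == 2); // prioritized pull jumps the queue
    assert(schedule(q)[1].number == 3); // then the oldest unprioritized one
}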
Dec 16 2011
On 17/12/2011 06:40, Brad Roberts wrote:
> [...]
>>> 2) finish the ui
>
> very ugly but minimally functional: http://d.puremagic.com/test-results/pulls.ghtml

Idea: I noticed most pull requests were failing when I looked at it, due to
the main build failing - that's a lot of wasted computing time. Perhaps it
would be a good idea to refuse to test pulls if dmd HEAD isn't compiling?
This would be problematic for the 1/100 pull requests designed to fix this,
but would save a lot of testing.

An alternative method could be to test all of them, but if a pull request
previously passed, then dmd HEAD broke, then the pull broke, stop testing it
until dmd HEAD is fixed.

--
Robert
http://octarineparrot.com/
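[The gating rule proposed here, sketched in D with a hypothetical per-platform
status lookup; a pull meant to fix a broken trunk would need a manual
override.]

enum BuildState { passing, failing, unknown }

// Placeholder: look up the latest trunk result for a platform in the
// tester's database.
BuildState trunkState(string platform) { return BuildState.unknown; }

bool shouldTestPulls(string platform)
{
    // Only spend cycles on pull requests while trunk is known to build.
    return trunkState(platform) == BuildState.passing;
}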
Dec 17 2011
On 12/17/2011 4:56 AM, Robert Clipsham wrote:
> Idea: I noticed most pull requests were failing when I looked at it, due to
> the main build failing - that's a lot of wasted computing time. Perhaps it
> would be a good idea to refuse to test pulls if dmd HEAD isn't compiling?
> This would be problematic for the 1/100 pull requests designed to fix this,
> but would save a lot of testing.
>
> An alternative method could be to test all of them, but if a pull request
> previously passed, then dmd HEAD broke, then the pull broke, stop testing it
> until dmd HEAD is fixed.

Yeah. I know I need to do something in that space and just haven't yet. This
whole thing is only a few evenings old and is just now starting to really
work. I'm still focused on making sure it's grossly functional enough to be
useful. Optimizations and polish will wait a little longer.

Thanks,
Brad
Dec 17 2011
On Sat, 17 Dec 2011 20:54:24 +0100, Brad Roberts <braddr puremagic.com> wrote:
> Yeah. I know I need to do something in that space and just haven't yet. This
> whole thing is only a few evenings old and is just now starting to really
> work. I'm still focused on making sure it's grossly functional enough to be
> useful. Optimizations and polish will wait a little longer.

Another optimization idea: put pull requests that fail to merge on an
inactive list, send a comment to github, and wait until the submitter does
something about them.

martin
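[A sketch of the merge pre-check this implies, shelling out to git in a
checkout of trunk. The branch handling is illustrative, and posting the
github comment is left out.]

import std.process : executeShell;

bool mergesCleanly(string headSha)
{
    // Attempt a throw-away merge of the pull's head onto the current HEAD.
    auto merge = executeShell("git merge --no-commit --no-ff " ~ headSha);

    // Clean up whether or not the merge succeeded; a failed merge may have
    // nothing to abort, so ignore that command's status.
    executeShell("git merge --abort");
    executeShell("git reset --hard HEAD");

    return merge.status == 0;
}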
Dec 19 2011
On 12/19/2011 4:05 AM, Martin Nowak wrote:
> Another optimization idea: put pull requests that fail to merge on an
> inactive list, send a comment to github, and wait until the submitter does
> something about them.

Way ahead of you, but it's low priority from a throughput standpoint: those
take almost no time to process. The benefit there is in getting the
notification back to the pull submitter, but that's true for all failures. :)

The biggest win I can think of right now is what I'll work on next: if one
platform has failed the build for a pull, skip that pull elsewhere unless
there's nothing else to do. With that, only one build is wasted, and only on
the platform that's making the fastest progress anyway.

Later,
Brad
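[That scheduling rule sketched in D, with hypothetical queue and status
types; an otherwise idle tester still falls back to the full queue so it
keeps making progress.]

import std.algorithm : filter;
import std.array : array;

struct PullJob { int number; }

// Placeholder: has any platform already reported a failed build for this pull?
bool anyPlatformFailed(int pullNumber) { return false; }

PullJob[] nextCandidates(PullJob[] queue)
{
    // Prefer pulls that no platform has failed yet.
    auto fresh = queue.filter!(j => !anyPlatformFailed(j.number)).array;
    return fresh.length ? fresh : queue;
}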
Dec 19 2011
On Fri, Dec 16, 2011 at 11:40 PM, Brad Roberts <braddr puremagic.com> wrote:
> [...]
>
> I'll need to experiment. It's the kind of thing where the more money thrown
> at the problem, the faster the builds can be churned through. The limit of
> that would be one box per platform per pull request, which is a silly
> extreme (and not possible anyway, since EC2 doesn't support some platforms,
> such as OS X).
>
> One c1.medium running 24x7 at the current spot price is about $29/month.
> Spot is perfect because this is a totally interruptible / resumable process.

That seems very reasonable. It sounds like this autotester will help
immensely with processing the pull request backlog. More free time for
Walter and everyone else who works on the project is a great thing.
Dec 16 2011