[Wien] [Olsr-users] pre-Xmas Bug Hunting and other stuff

Bernd Petrovitsch (spam-protected)
Fri Dec 21 12:37:25 CET 2007


OK, I'm overworked and need a holiday. This time with the correct mailing
list addresses in the Cc: (hence the full quote):

On Fri, 2007-12-21 at 12:23 +0100, Bernd Petrovitsch wrote:
> Hi all!
> 
> I have been quite quiet lately - mainly because my day job also
> needs attention (and pre-Xmas time in Vienna implies various Xmas
> parties and meeting people for a beer;-).
> 
> More seriously:
> 
> We (at least Hannes Gredler and /me) *thought* that this is a good point
> (and time) for a release (and several others didn't disagree:-).
> Sven-Ola Tücke has some patches with build fixes and cleanups for
> Windows which can/should IMHO go in before that.
> 
> We are also in the process of migrating from CVS (on sf.net) to
> Mercurial (on sf.net). BTW that was driven primarily by Hannes.
> The benefits are:
> - since Mercurial is one of these modern distributed SCMs, it is easier
>   for developers to move changesets around.
> - since the anonymous access to the main repository[0] will go over a
>   CGI script on the sf.net server, there shouldn't be any delay (of
>   several hours) between a commit mail and the actual change in the
>   publicly visible repository.
> - Mercurial automagically provides an RSS feed out of a repository.
> - Since I'm personally quite mail-centric, we will also send emails on
>   changes to the main repository.
>   To minimize changes and effort on all sides, I intend to keep even the
>   (spam-protected) mailing list for commit mails.
> 
> For a smooth transition, we need to update the documentation on
> http://www.olsr.org/ - including a simple introduction for people
> knowing CVS (WTH - I'm such a person). This should happen over Xmas
> holiday time.
> 
> Why the above *thought*:
> The FunkFeuer net in Vienna started upgrading several nodes to 0.5.4 -
> including the gateway to the rest of the Internet. However, we
> experienced route flaps in the net afterwards.
> 
> Summarizing from the internal FunkFeuer core list (which is in the Cc:
> and is otherwise in German):
> 
> It turned out that once in a while (the "while" can AFAIK be anything
> from a few minutes to lots of minutes) olsrd decides that one neighbor is
> not reachable (read: ETX == 0) and drops all routes to it. After a while,
> the connection is back and all routes are installed again (and everything
> is as before).
> And that is quite noticeable if that "dropped" neighbor is the main link
> to the Internet gateway.
> And this also happens on openvpn tunneled connections, which usually
> have an ETX of more or less 1.00.
> 
> The thread on http://www.freifunk-bno.de/forum/index.php?topic=930.0 (in
> German, found via Google) seems BTW to be about a similar problem.
> 
> Reverting to 0.5.0 on the Internet gateway solved (or at least seems to)
> the problem. So it seems to have to do with the olsrd version - and thus
> the decision to postpone a release until the cause of it is clear.
> 
> ATM no one knows (to the best of my knowledge) if that is a new bug in
> the implementation, or a combination of (vastly?) different versions, or
> something hidden which is only now coming up, or something completely
> different.
> 
> The main question IMHO is: Why is olsrd deciding that ETX == 0 at some
> point in time even on a stable link?
> 
> That requires adding debug code to a known br0ken version (the above
> thread indicates that e.g. CVS-HEAD is one) and finding the cause of it
> on a node that reliably shows that issue.
> 
> If you are not so into programming and debugging:
> It would also help if someone who experiences such a problem reliably and
> quickly could find the point in time in the CVS where it started to
> happen - or at least two points: one where it is definit^Wmost certainly
> not in (at or after 0.5.0) and a later one where it is there (at or
> before 0.5.4). Then everyone can look at the code changes in between.
> 
> That requires getting some in-between version from the CVS, trying it
> out and seeing if the problem occurs. Write that down and take a later or
> an earlier one. Ideally one takes the center of the remaining interval to
> minimize the number of tries.
> Repeat until you feel you know the above result.
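
The procedure quoted above is a plain bisection over the CVS history. As a
minimal sketch only (Python; the function name, the snapshot labels and the
is_bad() callback are hypothetical and not part of olsrd or any existing
tool), the bookkeeping could look like this. It assumes the snapshots are
ordered oldest to newest, the first one is known good, the last one is known
bad, and the problem does not disappear again once it has appeared:

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(snapshots: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the earliest snapshot that shows the problem.

    Assumptions (not verified here): snapshots are ordered oldest to
    newest, snapshots[0] is known good, snapshots[-1] is known bad.
    is_bad() stands for "build it, run it on an affected node, watch
    whether the route flaps occur" and is only called for in-between
    snapshots.
    """
    if len(snapshots) < 2:
        raise ValueError("need at least one good and one bad snapshot")
    good, bad = 0, len(snapshots) - 1
    while bad - good > 1:
        mid = (good + bad) // 2  # center of the remaining interval
        if is_bad(snapshots[mid]):
            bad = mid   # problem already present: look earlier
        else:
            good = mid  # still fine: look later
    return snapshots[bad]


if __name__ == "__main__":
    # Hypothetical labels; in practice these would be CVS dates or tags.
    history = ["0.5.0", "snap-1", "snap-2", "snap-3", "snap-4", "0.5.4"]
    # Pretend the problem was introduced in snap-3.
    print(first_bad(history, lambda s: history.index(s) >= 3))
```

Each try halves the remaining interval, so roughly log2(N) builds are needed
for N candidate snapshots. Since the flaps only show up "once in a while",
a run with no flap is weaker evidence than a run with one, so a "good"
verdict probably needs a longer observation time than a "bad" one.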
> 
> Further helpful information is of course also welcome, including
> corrections of the above if I missed or misunderstood something.
> 
> 	Bernd
> 
> [0]: I have to admit that I forgot to ask Hannes what the official term
> for that in Mercurial speak is;-)
> -- 
> Firmix Software GmbH                   http://www.firmix.at/
> mobil: +43 664 4416156                 fax: +43 1 7890849-55
>           Embedded Linux Development and Services
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services