SBC webservice stops

General PhidgetSBC Discussion.
glenn
Phidgeteer!
Posts: 93
Joined: Sun Sep 05, 2010 4:42 pm

Re: SBC webservice stops

Postby glenn » Fri Aug 30, 2013 12:46 pm

Have you been able to reproduce the USB errors?

Patrick
Lead Developer
Posts: 3038
Joined: Mon Jun 20, 2005 8:46 am
Location: Canada

Re: SBC webservice stops

Postby Patrick » Fri Aug 30, 2013 1:01 pm

But, but I expect that I will be able to when I can devote some more time.

I'll be out of office for 2 weeks starting Sept. 5th. I probably won't be able to look into this further until after I get back.

-Patrick


Re: SBC webservice stops

Postby glenn » Fri Aug 30, 2013 1:28 pm

Patrick: OK, I assume you meant "No, but". :)

Enjoy your vacation. Hopefully I will be able to report some progress in understanding this by the time you return.

Andre: If you're still reading this thread, would you be willing to get involved in some testing of this? It would require running the diagnostic client on your setup and collecting some results. The diagnostic can run unattended, but would probably require half an hour or so of your time to set up initially. If so, let me know and I will post an updated version of the wsdiag package this weekend or early next week. It's pretty self-contained and documented; it should only require modifying a few #defines to work on your setup, and then a little while to play with it and learn how to use it.

The downside, unfortunately, is that running the diagnostic would certainly interfere with whatever it is you're actually doing with your SBC, so it may not be feasible for you to run it.

Anyway, let me know, and we can communicate further about it.

Good weekend to you,
Glenn


Re: SBC webservice stops

Postby glenn » Tue Sep 03, 2013 11:50 am

Hi Patrick,

I have a favor to ask regarding debugging of this WSJS issue: Would it be possible to loan me a couple of Phidgets for debugging purposes?

The reason I'm asking is as follows: So far, I have been doing all WSJS debugging on the SBC itself. This is painful due to memory restrictions, slow builds, no per-thread gdb, etc. But if this thread interference problem is indeed the root cause, then it really should not be SBC-specific, but should also occur if I were to run WS on a "real" host (e.g. a spare laptop) and then exercise it using the client on my other laptop. The problem is that so far, I can't get the failure mode to occur (even on SBC) unless two Phidgets are involved. (I think this is because it is the interplay of the read/write threads for both Phidgets that's at the root of the issue.)

When I run WS on my SBC1, I get the PH1070 IK 8/8/8 "for free", and it also has an external PH1014, so that gives me the two Phidgets necessary to tickle the problem and get it to happen. I've tried repeatedly to get it to occur with only one Phidget, but so far, no luck, so I can't debug on the laptop with just one. (If I could get it to happen with just one Phidget, I'd take the PH1014 out of my SBC setup and use it on the spare laptop for debugging.)

So, if I had a couple of external Phidgets that I could attach to my spare laptop, I think I could probably get the problem to show up running WS there, and could then bring to bear the full debug capabilities of the laptop, including thread-aware gdb, valgrind, etc. I'm pretty sure I could get to the bottom of the thread-interference issue pretty quickly in that environment, whereas in the SBC environment, it's limited to printf()s and some simple gdb work, and it's painfully slow.

The particular Phidgets don't matter at all: The only requirement is that each Phidget have at least one digital bit that can be both read and written, so that the diagnostic client is able to determine whether the get/setXxx() functions are actually operating properly. They could be used, damaged, lab stock, whatever, as long as they actually work and have at least one digital R/W bit.
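A minimal sketch of the kind of read/write check described above, in C. This is not the actual wsdiag code: the `set_bit_fn` and `get_bit_fn` callback types, `rw_bit_roundtrip_ok()`, and the in-memory `fake_dev` are all hypothetical names introduced for illustration, standing in for whatever get/setXxx() calls the diagnostic client really uses.

```c
#include <stdbool.h>

/* Hypothetical accessors (assumption): stand-ins for the real
 * get/setXxx() calls. Each returns 0 on success. */
typedef int (*set_bit_fn)(void *dev, int index, int state);
typedef int (*get_bit_fn)(void *dev, int index, int *state);

/* Write 0 and then 1 to one digital bit, reading each value back.
 * Returns true only if every call succeeds and every read-back
 * matches the value just written. */
bool rw_bit_roundtrip_ok(void *dev, int index,
                         set_bit_fn set_bit, get_bit_fn get_bit)
{
    for (int want = 0; want <= 1; want++) {
        int got = -1;
        if (set_bit(dev, index, want) != 0)
            return false;
        if (get_bit(dev, index, &got) != 0)
            return false;
        if (got != want)
            return false;
    }
    return true;
}

/* In-memory fake device, for illustration only. When 'stuck' is
 * true, writes report success but are silently dropped. */
struct fake_dev {
    int bits[8];
    bool stuck;
};

int fake_set(void *dev, int index, int state)
{
    struct fake_dev *d = dev;
    if (!d->stuck)
        d->bits[index] = state;
    return 0;
}

int fake_get(void *dev, int index, int *state)
{
    struct fake_dev *d = dev;
    *state = d->bits[index];
    return 0;
}
```

Against real hardware behind a webservice, a write may take some time to propagate, so a real check would probably need to wait or retry before the read-back; the sketch leaves that out.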

Let me know if this would be feasible for you; it would be a great help.

Thanks,
Glenn


Re: SBC webservice stops

Postby glenn » Fri Sep 06, 2013 11:09 am

Patrick, Brian: Thanks very much for the loaner Phidgets and the quick shipment! They arrived yesterday and work fine. (One of them has S/N 99999... felt like I should go to a casino... :) [Yes, I know, it's probably just the default S/N, but kinda cool anyway... :) ])

So anyway... now for the bad news: Running WS on my spare laptop (also ArchLinux) and using the two loaner Phidgets, the WSJS problem does not seem to occur at all, unfortunately. In about 100 runs (similar to those in "expt03" on the SBC, documented earlier), roughly 50 of which had the problematic {open,close} ordering after a link outage -- which, on the SBC1, results in WSJSs about half the time -- there were no WSJSs at all. In fact, no failure modes of any sort. The laptop WS recovered cleanly every time. No USB errors, no thread peculiarities, no sign of any trouble whatsoever.

So... that's unfortunate. I was hoping for similar or identical failure modes as on the SBC.

But, on the plus side, the differences between laptop and SBC1 behavior may offer some clues nevertheless. For example: There are some peculiar differences in the behavior of the client-side HB event handler and HB monitor when working with WS on SBC1 vs. WS on the other laptop. When talking to the laptop WS, the HB events are evenly spaced, every 4 seconds. But when talking to the SBC1 WS, the HB events are very erratically spaced, ranging from a few seconds to more than a minute. So again, even though this is not directly related to the WSJS issue, the differences between SBC1 and laptop behavior may offer some clues as to what's going on on the SBC1.
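One way to put numbers on that HB spacing difference is to log the arrival time of each HB event and summarize the gaps afterwards. The following is a rough sketch in C, not code from the diagnostic client: `hb_gap_stats`, `hb_gap_summary()`, and the tolerance parameter are hypothetical, and the timestamps are assumed to be in seconds and in arrival order.

```c
#include <stddef.h>

/* Summary of the gaps between consecutive HB events (hypothetical). */
struct hb_gap_stats {
    double min_gap;   /* shortest gap seen, in seconds */
    double max_gap;   /* longest gap seen, in seconds */
    size_t late;      /* gaps longer than nominal + tolerance */
};

/* Summarize gaps between consecutive timestamps t[0..n-1].
 * Returns 0 on success, -1 if there are fewer than two timestamps
 * or 'out' is NULL. */
int hb_gap_summary(const double *t, size_t n,
                   double nominal, double tolerance,
                   struct hb_gap_stats *out)
{
    if (t == NULL || out == NULL || n < 2)
        return -1;

    out->min_gap = t[1] - t[0];
    out->max_gap = t[1] - t[0];
    out->late = 0;

    for (size_t i = 1; i < n; i++) {
        double gap = t[i] - t[i - 1];
        if (gap < out->min_gap)
            out->min_gap = gap;
        if (gap > out->max_gap)
            out->max_gap = gap;
        if (gap > nominal + tolerance)
            out->late++;
    }
    return 0;
}
```

With a nominal spacing of 4 seconds, evenly spaced events would give a late count of zero, while a run containing gaps of a minute or more would show up in both `max_gap` and `late`.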

Anyway... more this weekend when I have more time to look into it in detail.

Again, thanks for the quick response on the Phidgets. I think we'll learn something useful from them even if not what was intended.

Regards,
Glenn


Re: SBC webservice stops

Postby glenn » Mon Sep 23, 2013 10:02 am

Hi Patrick,

Hope you enjoyed (or are still enjoying) your vacation.

If/when you get to the point of looking further into the WSJS issue, reproducing fault modes, etc., here's a summary of the main points of what I've learned here over the past month or so.

http://misc.postpro.net/phidgets/summary_20130923/

It includes some new and interesting results obtained using the loaner Phidgets you sent (which have been very helpful, thanks.)

Let me know if you have any questions.

Regards,
Glenn


Re: SBC webservice stops

Postby Patrick » Mon Sep 23, 2013 10:34 am

Hi,

I am back. I'll have a look at your results.

-Patrick


Re: SBC webservice stops

Postby glenn » Mon Sep 23, 2013 12:11 pm

Welcome back!

Will be interesting to see if you can reproduce the OOB access faults with stock code on an x86 box. That alone would be of good value.

Btw, an erratum in that writeup: "PH1124 voltage divider" should be "PH1121".


Re: SBC webservice stops

Postby Patrick » Mon Sep 23, 2013 1:13 pm

Yes. I almost always recompile the library/webservice with '-O0 -g' before I start any debugging, so it's not surprising that I missed this one.

-Patrick


Re: SBC webservice stops

Postby glenn » Mon Oct 28, 2013 6:28 pm

Squeak-squeak. :)

What's your take on getting any further attention to this issue? Is it dead, pending the rewrite that you mentioned earlier? Or is there a chance we can get a fix into the existing code?

