Intermittent disconnect and reconnect of all devices on SBC3

Maculin
Phidgetsian
Posts: 11
Joined: Thu Jul 25, 2013 6:01 pm

Postby Maculin » Sun Sep 28, 2014 11:26 am

Good Morning

I've got a Phidgets SBC3 controlling a fan at a cottage. The system needs to be able to run for long periods of time unattended. Following a power surge, I had to replace most of the Phidgets (poor protection in my design).
The re-built system was installed in early September, and I was able to test it for a week. No problem! Everything was great.
We were back up to the cottage about a week ago, and I discovered that the program had either frozen or crashed.
I took a look at the logs (written to a USB flash drive every 10 minutes, or on certain events) and noticed that, before the log file ended, there were many entries for Phidget devices being attached, as if they kept getting unplugged and plugged back in. There were entries for ALL devices, INCLUDING the integrated 8/8/8 interface.
(I was going to write it off as humidity messing with the USB connectors until I noticed the integrated 8/8/8 was doing it too.)
Also, to add to the mystery, the web interface would not respond correctly, although SSH did (the web interface was OK after rebooting). After rebooting, the timezone and hostname of the SBC3 were messed up: they appear set in the web interface, but the clock is off by 4 hours (it is in UTC now), and the hostname command responds with "(none)". I don't know if this is related.

Any ideas?

Thanks!

arpad.sooky
Phidgetsian
Posts: 9
Joined: Thu Nov 14, 2013 5:13 pm

Postby arpad.sooky » Fri Oct 10, 2014 4:59 am

Hi,

I am experiencing exactly the same issue. In my setup I have some temperature sensors and a baro sensor connected to the SBC. I changed the code several times and tried to isolate what causes the error (and when it occurs). I figured out that before the crash/hang there are a lot of "sample overrun" errors. For a long time (days) everything runs normally, then at some point something goes wild and I start to receive this error continuously for all integrated analog inputs, including the ones with nothing connected:

Code:

INFO  root - Interfacekit error event.Error Event (36866): Channel 0: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 1: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 2: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 3: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 4: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 5: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 6: 15 sample overrun detected.
INFO  root - Interfacekit error event.Error Event (36866): Channel 7: 15 sample overrun detected.


- repeated thousands of times. After that the whole server app hangs, including the web server. I found a post suggesting that a local open should be used instead of a network SBC open. It did not help.
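One way to keep a flood like the "sample overrun" messages above from swamping the log is to rate-limit repeated lines. The following is only a minimal sketch in plain Java (Java 8 or later assumed); ThrottledLog and its filter method are hypothetical names and not part of the Phidget library. It lets each message key through at most once per interval and reports how many repeats it swallowed in between.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of the Phidget library): lets a message key
 * through at most once per interval and counts what it suppressed.
 */
class ThrottledLog {
    private final long intervalMillis;
    private final Map<String, Long> lastLogged = new HashMap<>();
    private final Map<String, Integer> suppressed = new HashMap<>();

    ThrottledLog(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /**
     * Returns the line to write, or null if this key was logged too recently.
     * nowMillis is passed in so the behaviour is easy to test.
     */
    synchronized String filter(String key, String message, long nowMillis) {
        Long last = lastLogged.get(key);
        if (last != null && nowMillis - last < intervalMillis) {
            // Too soon after the last logged line for this key: count and drop.
            suppressed.merge(key, 1, Integer::sum);
            return null;
        }
        lastLogged.put(key, nowMillis);
        Integer skipped = suppressed.remove(key);
        if (skipped == null) {
            return message;
        }
        return message + " (" + skipped + " similar messages suppressed)";
    }
}
```

In an error event handler, the key could be the channel number and the result would be passed to the logger only when it is not null. This limits log growth but does nothing about the cause of the overruns.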

I am stuck also, so any help / suggestion would be great!

Arpad

Patrick
Lead Developer
Posts: 3039
Joined: Mon Jun 20, 2005 8:46 am
Location: Canada

Postby Patrick » Fri Oct 10, 2014 8:28 am

This looks like a hang in an event callback - what does your code do in the events?

-Patrick
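
If slow work inside an event callback is the suspect, a common precaution is to have the callback do nothing except hand the reading to a worker thread. The sketch below is plain Java and rests on an assumption: that events are delivered on a thread owned by the Phidget library, so blocking in the callback delays further events. SensorReading and EventHandoff are hypothetical names, not library classes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical value object carrying one sensor change. */
class SensorReading {
    final int index;
    final int value;

    SensorReading(int index, int value) {
        this.index = index;
        this.value = value;
    }
}

/** Hypothetical bounded hand-off between an event callback and a worker thread. */
class EventHandoff {
    private final BlockingQueue<SensorReading> queue;
    private final AtomicLong dropped = new AtomicLong();

    EventHandoff(int capacity) {
        queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Call from the event callback: never blocks, drops the reading if the queue is full. */
    void publish(int index, int value) {
        if (!queue.offer(new SensorReading(index, value))) {
            dropped.incrementAndGet();
        }
    }

    /** Call from the worker thread: blocks until a reading arrives; null if interrupted. */
    SensorReading take() {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    /** Number of readings discarded because the worker fell behind. */
    long droppedCount() {
        return dropped.get();
    }
}
```

The callback would call publish(sce.getIndex(), sce.getValue()) and return immediately; logging, database writes and alarms would run in the worker that calls take(). A rising droppedCount() shows that the worker cannot keep up.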

Maculin
Phidgetsian
Posts: 11
Joined: Thu Jul 25, 2013 6:01 pm

Postby Maculin » Sat Oct 11, 2014 6:25 am

For myself, I am not actually using events to read the data from my various IO boards.
The only events I use are the Device Attach events for my three devices:
- Integrated 8/8/8
- Mini 2/2/2
- LCD adapter card (Drives the 4x40 display)
All I do in these events is set up the devices (set ratiometric, set sensitivity, set display size (for LCD)), and then write the event to my log file (Which is how I was able to detect the multiple connects of each device in the first place).

My program works on a polling basis. It runs in an infinite loop, doing everything about once a second.
I read the values of 4 AIs on the SBC3's integrated 8/8/8, 1 AI on a mini dongle 2/2/2*, and all 8 DIs on the integrated 8/8/8 (some of the DIs are read several times on each pass through the loop, as their state is needed in multiple places).
After a whole mountain of logic, all 8 DOs on the integrated 8/8/8 are driven, as well as the LCD display.

Obviously, I cannot comment on how Arpad is getting at his data.
However, I'm glad to know I'm not the only one having this issue.
@Arpad:
How are you getting at those error messages? They look helpful.

Thanks all!
------------------------------------------

* I'm using a mini 2/2/2 for a single AI to read an A/C current sensor (since the current sensor needs a different "Ratiometric" setting from the rest of the sensors)

arpad.sooky
Phidgetsian
Posts: 9
Joined: Thu Nov 14, 2013 5:13 pm

Postby arpad.sooky » Tue Oct 14, 2014 1:16 am

So, here is how I do it:
I set up an IK (SBC) with temperature sensors and a baro sensor. (I used to have a motion sensor, but I unplugged it, thinking it might be the problem. It wasn't.) As in Maculin's case, the setup code is in the device attach event.
This is my sensor change event:

Code:

@Override
public void sensorChanged(SensorChangeEvent sce) {
    switch (sce.getIndex()) {
    case MOTION_PORT:
        try {
            // Ignore motion while the security system is switched off.
            if (!TPStatus.isSecurityOn()) {
                break;
            }
            // Debounce: skip events arriving less than 500 ms after the last handled one.
            long time = System.currentTimeMillis();
            if ((time - motionTimer) < 500) {
                break;
            }
            motionTimer = time;
            java.util.Date currentTime = new java.util.Date();
            SimpleDateFormat df = new SimpleDateFormat("HH:mm");
            String dateString = df.format(currentTime);
            logger.info("Motion detected at " + dateString + ". (" + ifk.getSensorValue(MOTION_PORT) + ")");
            TPStatus.sendAlarm("Motion detected - " + dateString);
        } catch (PhidgetException e) {
            e.printStackTrace();
        }
        break;

    default:
        // Changes on all other ports are ignored here.
        break;
    }
}


TPStatus.isSecurityOn() always returns false for now, so the handler breaks out immediately after entering this method.

Besides this, I start a periodic task in the IK attach event using the ScheduledThreadPoolExecutor, where I read the temperature sensor data every X minutes, and save the values in a sqlite DB.
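
One thing worth ruling out when a periodic task is started from the attach event: if the device re-attaches repeatedly, as described earlier in this thread, every attach could schedule another copy of the task. The sketch below is only an illustration in plain Java (Java 8 or later assumed); SinglePeriodicTask is a hypothetical name. It schedules the job at most once and also catches runtime exceptions, because a scheduled executor suppresses all further runs of a periodic task once one run throws.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical wrapper: starts a periodic job at most once, however many
 * times start() is called (for example from repeated attach events).
 */
class SinglePeriodicTask {
    private final ScheduledExecutorService executor;
    private final Runnable job;
    private final long periodMillis;
    private ScheduledFuture<?> handle;

    SinglePeriodicTask(ScheduledExecutorService executor, Runnable job, long periodMillis) {
        this.executor = executor;
        this.job = job;
        this.periodMillis = periodMillis;
    }

    /** Returns true if a new task was scheduled, false if one is already active. */
    synchronized boolean start() {
        if (handle != null && !handle.isDone()) {
            return false;
        }
        // Catch runtime exceptions: one escaping the Runnable would make the
        // executor silently cancel every later run of this task.
        Runnable guarded = () -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                e.printStackTrace();
            }
        };
        handle = executor.scheduleAtFixedRate(guarded, 0, periodMillis, TimeUnit.MILLISECONDS);
        return true;
    }

    /** Cancels the task, for example from a detach event. */
    synchronized void stop() {
        if (handle != null) {
            handle.cancel(false);
            handle = null;
        }
    }
}
```

With this in place the attach handler calls start() and the detach handler calls stop(), so a burst of attach events still leaves a single temperature-reading task running.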

As I wrote before, this works fine for a few days. Then I get the errors, and my application becomes unresponsive.

Arpad

Patrick
Lead Developer
Posts: 3039
Joined: Mon Jun 20, 2005 8:46 am
Location: Canada

Postby Patrick » Tue Oct 14, 2014 8:40 am

Is your code running on the SBC, or on a PC with openRemote?

This sounds like there could be a memory leak in your program - if it's running on the SBC.

-Patrick

arpad.sooky
Phidgetsian
Posts: 9
Joined: Thu Nov 14, 2013 5:13 pm

Postby arpad.sooky » Wed Oct 15, 2014 1:46 am

It is running on an SBC3. I'll check for memory leaks today.

Maculin
Phidgetsian
Posts: 11
Joined: Thu Jul 25, 2013 6:01 pm

Postby Maculin » Wed Oct 15, 2014 5:20 pm

Mine is also running directly on the SBC3.

I ran it for about a week straight at home before deploying it at the cottage, writing down memory consumption every once in a while. I'm fairly certain it does not have a memory leak.
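
For a program written in Java, memory consumption can be recorded automatically instead of being written down by hand. The sketch below is a hypothetical helper (HeapSnapshot is an assumed name) that formats the JVM's heap usage as one log line. It covers only the Java heap, so a leak in native code would not show up in it.

```java
/** Hypothetical helper for logging JVM heap usage at intervals. */
class HeapSnapshot {
    /** Bytes of heap currently in use by this JVM. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One log line, suitable for appending to an existing periodic log. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long usedKb = usedBytes() / 1024;
        long maxKb = rt.maxMemory() / 1024;
        return "heap used=" + usedKb + " KiB, max=" + maxKb + " KiB";
    }
}
```

Writing HeapSnapshot.describe() alongside the regular 10-minute log entry would show whether usage trends upward over days.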

More Info:
After the failure described in my initial post, I disconnected the Interface Kit 2/2/2 and LCD adapter (Parts of the program can function correctly without them).
We were at the cottage again a few days ago.
We checked the log file, and there was NO indication of the problem; the program was running and the data was being recorded to the flash drive every 10 minutes like it should have, with no records of Device Attach events (except startup, of course).

I don't know if that helps or not, or just makes things more confusing.

ChrisEdgington
Phidgetsian
Posts: 5
Joined: Thu Sep 26, 2013 4:21 pm

Postby ChrisEdgington » Tue Jan 13, 2015 8:53 pm

Did you figure out your USB problem? I had a similar problem, but I was using an external 5-port USB hub attached to an 8/8/8. The devices I had hooked up to the hub must have been occasionally drawing too much current, because my problem was solved completely by adding a power supply to the hub. I had been using the hub as bus-powered and was having random all-device disconnects and re-enumeration. Once I added a power supply to the hub, all of those problems went away. Maybe try attaching an externally powered hub to the SBC3 and moving some of your devices to that hub.

-Chris

Maculin
Phidgetsian
Posts: 11
Joined: Thu Jul 25, 2013 6:01 pm

Postby Maculin » Tue Dec 01, 2015 12:01 pm

(Sorry for the long delay. The project kinda got put on the back burner)

In attempting to work out the problem, I brought the system home, and have it running on my desk. I have found that when this problem crops up, it fills the system logs with lots of messages.
Unfortunately, I'm not quite a Linux guru, and don't know what they mean:

kern.log:

Code:

Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.040000] [<c01ada24>] (ubifs_write_begin+0x258/0x518) from [<c0068cc0>] (generic_file_buffered_write+0xe8/0x280)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.050000] [<c0068cc0>] (generic_file_buffered_write+0xe8/0x280) from [<c006a4a0>] (__generic_file_aio_write+0x3f0/0x434)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.060000] [<c006a4a0>] (__generic_file_aio_write+0x3f0/0x434) from [<c006a558>] (generic_file_aio_write+0x74/0xdc)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.070000] [<c006a558>] (generic_file_aio_write+0x74/0xdc) from [<c01acd5c>] (ubifs_aio_write+0x178/0x190)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.080000] [<c01acd5c>] (ubifs_aio_write+0x178/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.090000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.100000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.110000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.120000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c5280>] (ubifs_budget_space+0x298/0x6c4)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.130000] [<c01c5280>] (ubifs_budget_space+0x298/0x6c4) from [<c01accbc>] (ubifs_aio_write+0xd8/0x190)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.140000] [<c01accbc>] (ubifs_aio_write+0xd8/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.150000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.160000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.170000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.180000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.190000] [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534) from [<c01acd44>] (ubifs_aio_write+0x160/0x190)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.200000] [<c01acd44>] (ubifs_aio_write+0x160/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.210000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.220000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.230000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.250000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c5280>] (ubifs_budget_space+0x298/0x6c4)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.260000] [<c01c5280>] (ubifs_budget_space+0x298/0x6c4) from [<c01accbc>] (ubifs_aio_write+0xd8/0x190)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.270000] [<c01accbc>] (ubifs_aio_write+0xd8/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.270000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.280000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.290000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:13 phidgetsbc-2014 kernel: [771360.310000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771360.320000] [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534) from [<c01acd44>] (ubifs_aio_write+0x160/0x190)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.210000] [<c01c5280>] (ubifs_budget_space+0x298/0x6c4) from [<c01accbc>] (ubifs_aio_write+0xd8/0x190)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.220000] [<c01accbc>] (ubifs_aio_write+0xd8/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.230000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.240000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.240000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.260000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.270000] [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534) from [<c01acd44>] (ubifs_aio_write+0x160/0x190)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.280000] [<c01acd44>] (ubifs_aio_write+0x160/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.290000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.300000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.310000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.320000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c5280>] (ubifs_budget_space+0x298/0x6c4)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.330000] [<c01c5280>] (ubifs_budget_space+0x298/0x6c4) from [<c01ada24>] (ubifs_write_begin+0x258/0x518)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.340000] [<c01ada24>] (ubifs_write_begin+0x258/0x518) from [<c0068cc0>] (generic_file_buffered_write+0xe8/0x280)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.350000] [<c0068cc0>] (generic_file_buffered_write+0xe8/0x280) from [<c006a4a0>] (__generic_file_aio_write+0x3f0/0x434)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.360000] [<c006a4a0>] (__generic_file_aio_write+0x3f0/0x434) from [<c006a558>] (generic_file_aio_write+0x74/0xdc)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.370000] [<c006a558>] (generic_file_aio_write+0x74/0xdc) from [<c01acd5c>] (ubifs_aio_write+0x178/0x190)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.380000] [<c01acd5c>] (ubifs_aio_write+0x178/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.390000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.400000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.410000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.440000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c5280>] (ubifs_budget_space+0x298/0x6c4)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.450000] [<c01c5280>] (ubifs_budget_space+0x298/0x6c4) from [<c01accbc>] (ubifs_aio_write+0xd8/0x190)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.460000] [<c01accbc>] (ubifs_aio_write+0xd8/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.470000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.470000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:14 phidgetsbc-2014 kernel: [771363.480000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:15 phidgetsbc-2014 kernel: [771363.500000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771363.510000] [<c01c4dbc>] (ubifs_release_budget+0x3b4/0x534) from [<c01acd44>] (ubifs_aio_write+0x160/0x190)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771363.520000] [<c01acd44>] (ubifs_aio_write+0x160/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771363.530000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771363.540000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771363.540000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771363.560000] [<c00144cc>] (unwind_backtrace+0x0/0xf0) from [<c01c5280>] (ubifs_budget_space+0x298/0x6c4)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771365.320000] [<c006a4a0>] (__generic_file_aio_write+0x3f0/0x434) from [<c006a558>] (generic_file_aio_write+0x74/0xdc)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771365.330000] [<c006a558>] (generic_file_aio_write+0x74/0xdc) from [<c01acd5c>] (ubifs_aio_write+0x178/0x190)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771365.340000] [<c01acd5c>] (ubifs_aio_write+0x178/0x190) from [<c009b434>] (do_sync_write+0x90/0xcc)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771365.350000] [<c009b434>] (do_sync_write+0x90/0xcc) from [<c009c038>] (vfs_write+0xac/0x184)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771365.360000] [<c009c038>] (vfs_write+0xac/0x184) from [<c009c1bc>] (sys_write+0x3c/0x68)
Sep 22 19:31:16 phidgetsbc-2014 kernel: [771365.360000] [<c009c1bc>] (sys_write+0x3c/0x68) from [<c000eca0>] (ret_fast_syscall+0x0/0x2c)




One good thing I have discovered: the problem with the web interface acting up and the time zone being wrong was a symptom of the logs getting so full that they filled up system memory. Clearing the system logs and rebooting the SBC fixed those problems.
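
Since full logs turned out to cause the secondary symptoms, a program running on the SBC could watch the free space itself and raise a warning before things break. The following is a minimal sketch in plain Java; DiskSpaceCheck is a hypothetical name, and both the path to monitor and the threshold are assumptions to be chosen for the actual system.

```java
import java.io.File;

/** Hypothetical check for a file system that is running out of room. */
class DiskSpaceCheck {
    /** True if the file system holding 'path' has fewer than minFreeBytes usable. */
    static boolean isLow(File path, long minFreeBytes) {
        return path.getUsableSpace() < minFreeBytes;
    }
}
```

For example, DiskSpaceCheck.isLow(new File("/var/log"), 5L * 1024 * 1024) could be evaluated once per pass of the main loop, with the result written to the log on the USB flash drive.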

