C4 Forums | Control4

Switch


alf1096


EagleMoon, how would you interpret the RMON stats I posted in #41 above? Any ideas for next steps on troubleshooting?

Sorry. I had written a response yesterday but got interrupted apparently before posting it.

You don't have any errors so it's not a duplex mismatch, assuming the stats are valid. If there were a mismatch, the "HALF" (or "AUTO") end would be recording collisions. The "FULL" end would be showing every other kind of error.
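As a rough illustration of that rule of thumb, here is a small Python sketch. The counter names (`collisions`, `crc_errors`, `runts`) and the numbers are made up for the example and are not taken from the stats posted in this thread; it only encodes the heuristic described above, not a definitive diagnosis.

```python
def looks_like_duplex_mismatch(half_side, full_side):
    """Heuristic: collisions on one end of the link, other errors on the far end.

    Both arguments are dicts of error counters read from each end of the link.
    The key names are hypothetical; use whatever your switch or NIC reports.
    """
    collisions = half_side.get("collisions", 0)
    # Anything that isn't a collision counts as "every other kind of error".
    other_errors = sum(v for k, v in full_side.items() if k != "collisions")
    return collisions > 0 and other_errors > 0


# Made-up readings for illustration only.
suspect = looks_like_duplex_mismatch(
    {"collisions": 120}, {"crc_errors": 30, "runts": 4}
)
clean = looks_like_duplex_mismatch(
    {"collisions": 0}, {"crc_errors": 0, "runts": 0}
)
print("suspect link:", suspect)
print("clean link:", clean)
```

With all counters at zero, as in the stats discussed here, the check comes back negative.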

The "Drop Events" presumably means the switch was having to drop frames for some reason on that port. I would guess that was due to congestion in the switch. The actual percentage of drops compared to total frames is minuscule, but if they all happened in a short burst it could still be a problem. You'd have to monitor periodically to see if it appears to be a bursty problem or more random.
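One way to tell bursty from random is to read the port's cumulative counters at a fixed interval and compare the per-interval deltas. Below is a minimal Python sketch of that idea. The readings are invented, the function names are just for illustration, and how you actually collect the counters (web UI, SNMP, etc.) is left out.

```python
def interval_deltas(samples):
    """Turn cumulative (total_frames, drop_events) readings into per-interval deltas."""
    return [
        (f1 - f0, d1 - d0)
        for (f0, d0), (f1, d1) in zip(samples, samples[1:])
    ]


def drop_summary(samples):
    """Return (overall drop ratio, share of all drops that fell in the worst interval)."""
    deltas = interval_deltas(samples)
    total_frames = sum(f for f, _ in deltas)
    total_drops = sum(d for _, d in deltas)
    overall = total_drops / total_frames if total_frames else 0.0
    worst = max((d for _, d in deltas), default=0)
    worst_share = worst / total_drops if total_drops else 0.0
    return overall, worst_share


# Hypothetical cumulative readings, e.g. one per minute.
samples = [
    (0, 0),
    (100_000, 0),
    (200_000, 45),
    (300_000, 50),
    (400_000, 50),
]

overall, worst_share = drop_summary(samples)
print(f"overall drop ratio: {overall:.4%}")
print(f"share of drops in worst interval: {worst_share:.0%}")
```

A tiny overall ratio with most of the drops landing in one interval points to a burst; drops spread evenly across intervals look more random. Keep in mind that the RMON drop-events counter counts how many times the drop condition was detected, not necessarily how many frames were lost, so treat the ratio as a rough indicator.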

In any case, I'm not sure what would cause that in a practical sense; are some of the ports running at different speeds? Do you still have flow control enabled somewhere on the switch? Is it enabled on the WHS? Is it possible that some of your devices are set to use jumbo frames and the switch or other devices are not?

Hmmm....I assume those were outbound drops by the switch on egress from the switch's WHS port (as opposed to drops of frames inbound from WHS because their egress port was congested), so the problem would seem to be congestion at the WHS, especially if it has flow control enabled. This could be the cause: You might be dumping so much data to the WHS (backups from clients?) that its disk can't keep up. Then the WHS input buffers would fill up at the TCP/IP stack and could trigger flow control back toward the switch, ultimately causing the switch to drop frames because its own buffers began to fill.
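To make the buffer argument concrete, here is a toy Python model. Every number in it is invented, and it ignores flow-control pause frames and real switch buffer management entirely. It only shows the basic effect: when frames arrive faster than the far end drains them, the queue fills and the overflow gets dropped.

```python
def simulate_queue(arrivals, drain_per_tick, buffer_size):
    """Toy FIFO: each tick, accept what fits, drop the rest, then drain.

    arrivals       -- frames arriving in each tick (list of ints)
    drain_per_tick -- frames the receiver can take per tick
    buffer_size    -- maximum frames the port can hold
    Returns (frames dropped, frames still queued at the end).
    """
    queued = 0
    dropped = 0
    for arriving in arrivals:
        space = buffer_size - queued
        accepted = min(arriving, space)
        dropped += arriving - accepted
        queued += accepted
        queued -= min(queued, drain_per_tick)
    return dropped, queued


# Receiver keeps up: nothing should be dropped.
print(simulate_queue([10] * 5, drain_per_tick=10, buffer_size=20))

# Receiver is slow (say, disk-bound): the buffer fills and drops begin.
print(simulate_queue([10] * 5, drain_per_tick=6, buffer_size=20))
```

In the slow-receiver case the first few ticks are absorbed by the buffer and the drops only start once it is full, which is why a sustained backup could produce drops while ordinary traffic would not.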

If your WHS is "I/O bound" for some reason -- lots of disk thrashing -- then that would also affect its throughput for outbound traffic (like playing a movie, if that's one of your symptoms). Enabling or disabling flow control might or might not help you if that's the case. Digging into WHS performance is over my head. As a quick and dirty test, you might start by turning off all applications on all other computers that use the WHS NAS, except for the one that is most critical and most often exhibits the problem, and see if the problem goes away.

Thanks, that was very helpful and you confirmed some of my suspicions. I should have mentioned - the WHS has a SATA RAID card (the mobo is an older IDE version). The card had 2 x 2TB drives on it. I also had an external USB 2.0 drive hooked up. Today I put in 2 more 2TB drives so I've now got 4 x 2TB on that RAID card, and I pulled the external USB drive. I also put in a new NIC for good measure.

My thinking is that I'll be able to isolate either the RAID card or the USB drive as potential bottlenecks. Your comments about the backups from clients are what made me suspicious. I think either the RAID card or the USB drive was to blame. My suspicion is the RAID card. If I'm right, it likely means a new mobo, but better to fix it now.

....more testing to come...


Update: watched Megamind with the kids last night streaming from the WHS to the Dune and it was flawless. Not one hiccup.

The only other problem I have right now is my main Desktop machine is not backing up to the WHS, but my Laptop is, which seems odd. I used to have both attached to the router instead of directly to the switch. I've changed that so now everything is sitting on the switch. I'm not getting any "drop events" anymore, but the backup is still failing. Both the WHS and Desktop still have flow control enabled, but they have always been that way and I've never had a problem before. I'll have to dig into the logs and see if I can figure out what's happening - it may be unrelated to network issues.

Thanks EagleMoon for your help. Your thoughts and suggestions have definitely improved my network!

EDIT: Just ran a manual backup on my Desktop and it worked! No issues. So it must not be waking the machine up properly for the nighttime routine.


"improved my network"

So your changes were:

1. increased number of RAID drives, doubling storage

2. removed USB drive

3. replaced NIC

4. moved all devices to ports on main switch rather than router switch ports

To help nail down the cause, it would be interesting to re-attach the USB drive to see if the problem returns. Then, at a different time, put the original NIC back in to see if the problem returns. If there are no problems, you could try putting devices back on the router's switch ports, though I doubt that caused it.

Was the RAID array nearly full? Even if it were, adding more storage shouldn't affect READ performance, should it?

My bet is on the NIC or the USB drive, though. I'm wondering if it periodically had to spin up the USB drive and suffer an I/O wait while that happened. I used to see that on my Macs when I had certain USB drives attached even if I wasn't doing anything that should have accessed them.


Yes, that is an accurate summary of the changes I made.

I was thinking the same thing about adding back the USB drive. That is my strong suspicion for a couple of reasons:

1. The RAID was starting to fill up (that's why I added more drives), so maybe WHS doesn't start adding to the USB drive until it runs out of internal drives

2. The WHS had reported some errors connecting to the USB drive in the logs. I think you're right about it "waking up".

3. Related to the above, the backups were working fine until just recently (as were movies), which supports the USB theory - maybe it's hitting the USB for certain files/backups and that's when it fails

It could also be a problem in the USB I/O of the motherboard. But all roads definitely point to the USB drive...


Archived

This topic is now archived and is closed to further replies.

