Posts Tagged ‘Incident Report’

Incident report: 50% of ActiveSync connections failed..

Posted on: June 17th, 2025 by Mads Petersen

On Sunday 2025-06-15 one of the Kolab Now ActiveSync servers became overloaded and got stuck on failed jobs. In turn, about 50% of the ActiveSync jobs were failing, which led to some ActiveSync users losing their connections to Outlook or to their mobile devices.

> Continue Reading

Tags: ActiveSync, Incident Report

Incident report: Some mails to Microsoft online services were getting blocked..

Posted on: May 7th, 2025 by Mads Petersen

Yesterday we dealt with a spammer/phisher who specifically targeted the Microsoft outlook.com service (and its affiliates). One of the Kolab Now MX servers was listed on the Microsoft throttle list (S3150). This meant that some users saw mails sent to recipients at ‘@outlook.com’, ‘@live.com’, ‘@hotmail.com’, and other Microsoft online services being bounced back.

> Continue Reading

Tags: Anti-Spam, Incident Report, SPAM, Support

Incident report: Some mails to Microsoft online services were getting blocked..

Posted on: February 13th, 2025 by Mads Petersen

Earlier this afternoon one of the Kolab Now MX servers was listed on the Microsoft block list. This means that some users might have seen mails sent to recipients at ‘@outlook.com’, ‘@live.com’, ‘@hotmail.com’, and other Microsoft online services being bounced back with a message that looks something like this:

This is the mail system at host mx.kolabnow.com. 
I'm sorry to have to inform you that your message could not 
be delivered to one or more recipients. It's attached below. 
For further assistance, please send mail to postmaster. 
If you do so, please include this problem report. You can 
delete your own text from the attached returned message. 
The mail system

<some-email@outlook.com>: host
outlook-com.olc.protection.outlook.com[x.x.x.x] said: 550 5.7.1 
Unfortunately, messages from [y.y.y.y] weren't sent. Please contact 
your Internet service provider since part of their network is on our block 
list (S3150). You can also refer your provider to 
http://mail.live.com/mail/troubleshooting.aspx#errors. [Name=Protocol 
Filter Agent][AGT=PFA][MxId=<some long number>] 
[SG2PEPF03345FBECA.apcprd05.prod.outlook.com 2025-02-13T<timestamp>Z 
<another long number>] (in reply to MAIL FROM command)
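If you want to check whether a bounce you received matches this incident, the relevant parts are the SMTP status (`550 5.7.1`) and the list code in parentheses (`S3150`). Here is a minimal sketch of pulling those out of a bounce text. The function name `parse_bounce` and both patterns are our own illustration, derived only from the sample above and not from any documented Microsoft or Postfix format, so treat them as assumptions.

```python
import re

# Hypothetical patterns, based only on the sample bounce shown above.
SMTP_STATUS = re.compile(r"\b([245]\d{2}) (\d\.\d{1,3}\.\d{1,3})\b")
BLOCK_LIST_CODE = re.compile(r"block\s+list\s+\((S\d+)\)")


def parse_bounce(text):
    """Return (smtp_code, enhanced_status, list_code), or None if no SMTP status is found."""
    # Bounce messages are hard-wrapped, so collapse all whitespace first.
    flat = " ".join(text.split())
    status = SMTP_STATUS.search(flat)
    if status is None:
        return None
    listing = BLOCK_LIST_CODE.search(flat)
    list_code = listing.group(1) if listing else None
    return status.group(1), status.group(2), list_code


sample = """outlook-com.olc.protection.outlook.com[x.x.x.x] said: 550 5.7.1
Unfortunately, messages from [y.y.y.y] weren't sent. Please contact
your Internet service provider since part of their network is on our block
list (S3150)."""

print(parse_bounce(sample))
```

A bounce that carries a different status, or no list code at all, is probably unrelated to this listing.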

Although the listing was discovered quickly and Microsoft was contacted right away, it took a while before the listing was reversed. At this time, emails should again be delivered to the Microsoft online services.

A few users have mistaken the symptoms for error messages caused by missing the DKIM changes made on Monday (please read this blog post from December 2024 and the follow-ups). If you are a group manager, please make sure that you have added the new DKIM-related CNAME records to your DNS zone.

If you have any questions or concerns in this context, then please contact support.

Tags: Anti-Spam, blog, DKIM, DNS, Incident Report, Security, SPAM, Support

Incident report: Spam filtering overflow filling up disk..

Posted on: December 19th, 2024 by Mads Petersen

On Wednesday 2024-12-18 in the early evening (~19:00 UTC) a spammer attempted to use a Kolab Now account to send out large amounts of spam. The Kolab Now exit spam filter sorted out the spam and redirected it, as it was supposed to do, and none of the spam was sent out. The spammer was, however, persistent and kept sending, which eventually filled up a disk and blocked traffic. Due to ongoing maintenance on the monitoring, the full disk was unfortunately not discovered until Thursday morning, when the problem was immediately corrected, and queued mails were again flowing in both directions.

The problem caused a group of users (about 30%) to be unable to receive mail, and sent mail was queued until the space was again freed up and spooling was possible. No mail should have been lost during the incident.

The missing monitoring has been put back into action, and the Kolab Now Engineering team is evaluating changes that will prevent the situation from repeating.
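For readers who run their own mail servers, the kind of check that catches a filling disk early can be very small. The following is only an illustrative sketch and not the monitoring used at Kolab Now: the function names, the 80%/90% thresholds, and the checked path are all assumptions made for the example.

```python
import shutil


def classify(percent_used, warn_at=80.0, crit_at=90.0):
    """Map a disk usage percentage to a simple status string."""
    if percent_used >= crit_at:
        return "CRITICAL"
    if percent_used >= warn_at:
        return "WARNING"
    return "OK"


def check_disk(path, warn_at=80.0, crit_at=90.0):
    """Return (status, percent_used) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    percent_used = 100.0 * usage.used / usage.total
    return classify(percent_used, warn_at, crit_at), percent_used


if __name__ == "__main__":
    # "." is a placeholder; a real check would point at the mail spool volume.
    status, percent_used = check_disk(".")
    print(f"{status}: {percent_used:.1f}% used")
```

Run from a scheduler and wired to an alert, a check like this warns well before a spool volume is completely full.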

We apologize for any inconvenience that this incident may have caused.

Tags: Incident Report, SPAM

Incident: No internal email delivery..

Posted on: November 25th, 2024 by Mads Petersen

During the investigation of another issue on the Kolab Now front-end servers, email delivery was broken on one (of many) front-end servers. This caused emails for a group of users to be queued on an internal MX server. When the issue was discovered, it was quickly resolved, and emails were again distributed to the user inboxes.

The queuing of emails started at 2024-11-22@21:56 UTC and lasted until 2024-11-23@15:02. Emails were delayed – not dropped. No emails were lost. All emails have been delivered at this time.

Monitoring has been put in place to detect this and similar delivery issues going forward.

We apologize for any inconvenience that this may have caused.

Tags: Incident Report, Monitoring

Incident: Service outage

Posted on: November 6th, 2024 by Christian Mollekopf

Kolab Now is currently experiencing a networking infrastructure interruption. We apologize for the inconvenience while we investigate the issue.
We will update this blog post as soon as more information is available.

2024-11-06 @ 05:44 UTC: This issue was triggered by one of our hypervisors spontaneously rebooting. Most services have been restored, and the Operations team is working through the remaining issues. Users can log in and use the facilities.
2024-11-06 @ 07:17 UTC: The incident has been resolved.

Tags: Incident Report

Incident: DATABASE ERROR!

Posted on: October 29th, 2024 by Mads Petersen

A database issue has just presented itself, and our operations team is investigating to find the cause and fix it.

Users who try to make use of the webclient will get the message:

DATABASE ERROR!
Unable to connect to the database!
Please contact your server-administrator.

Operations have a good lead on the issue, and we expect everything to be back online shortly.

You can follow the situation here on this blog.

2024-10-29 @ 10:22 UTC: The root cause of the issue has been identified as a problem with a sync routine in the database cluster. The Operations team is working to get synchronization back in order. Meanwhile, login and use of the webclient are working again. Users can log in and use the facilities.

Please keep an eye on this blog, as there might be follow-on performance issues. The synchronization uses a lot of resources and will most probably slow down the systems while it is running.

Tags: Incident Report, Web Client

Incident report: Spool overflow filling up disk..

Posted on: July 2nd, 2024 by Mads Petersen

On Monday 2024-07-01 in the early evening (CEST) a spammer attempted to use a Kolab Now account to send out large amounts of spam. The Kolab Now exit spam filter sorted out the spam and redirected it, as it was supposed to do, and none of the spam was sent out. The spammer was, however, persistent and kept sending, which eventually filled up a disk and blocked traffic. Due to ongoing maintenance on the monitoring, the full disk was unfortunately not discovered until Tuesday morning, when the problem was immediately corrected, and queued mails were again flowing in both directions.

The missing monitoring has been put back into action, and the Kolab Now Engineering team is evaluating changes that will prevent the situation from repeating.

We apologize for any inconvenience that this incident may have caused.

Tags: Anti-Spam, Incident Report

Incident report: One external submission server overwhelmed by spam flood..

Posted on: January 8th, 2024 by Mads Petersen

On 2024-01-07 a spammer sent a large flood of mail through one of the external submission servers. The server stopped the spam mails and saved them to a separate holding queue to make room for other users.

It took a while for the reporting to reach the operations team, but as soon as the issue was known it was swiftly resolved at ~19:00. In the meantime, however, the server ran out of space, and some users (those who hit that server) would have seen sending and receiving fail.

We apologize for the inconvenience that this issue has caused, and will focus on improving the reporting to also cover this specific issue.

Tags: Incident Report, SPAM

Incident report: Service not restarting automatically on failure

Posted on: September 19th, 2023 by Mads Petersen

In the very late hours (UTC) of 2023-09-18 some users found that they could not receive mails or perform administrative activities, such as making payments or creating new users. When they tried, they were given an error message: ‘Internal Server error’. This lasted until the morning (UTC) of 2023-09-19.

The issue was caused by the in-memory data store (Redis) running out of memory.

> Continue Reading

Tags: Incident Report