2019-06-18
Video surveillance is an application of artificial intelligence that is easy to realize. The idea is to annotate a video stream with semantic information, parse the action words and store the detected items in a database. A potential use case is to monitor a cartoon clip and detect what the characters are doing and how they interact with each other.
Most video surveillance systems are built on background subtraction. The difference between two still images is computed, which makes the motion visible. Contrary to a common misconception, vanilla background subtraction works great, especially under lighting variations. The idea that there is a problem which has to be overcome with a type-2 fuzzy model is wrong. But let us give the paper a chance to explain the details.[1] The assumption is that each pixel consists of modes which describe its membership degree to the background and foreground layers. A pixel belongs to the foreground and the background at the same time. Does this make sense? No, it's a dead end, and the paper struggles to explain the mathematical background of uncertainty.
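To make the vanilla approach concrete, here is a minimal sketch of frame differencing in plain Python. It is not taken from the cited paper. The function name `frame_difference`, the threshold of 25 and the tiny 3x3 grayscale frames are assumptions chosen for illustration only.

```python
def frame_difference(previous, current, threshold=25):
    """Return a binary motion mask for two grayscale frames of equal size.

    Frames are lists of rows with intensity values from 0 to 255.
    The threshold of 25 is an illustrative assumption, not a recommended value.
    """
    mask = []
    for row_prev, row_cur in zip(previous, current):
        # A pixel counts as "moving" if its intensity changed by more than the threshold.
        mask.append([1 if abs(cur - prev) > threshold else 0
                     for prev, cur in zip(row_prev, row_cur)])
    return mask


# Hypothetical 3x3 frames: an empty scene and the same scene with one bright object.
background = [[10, 10, 10],
              [10, 10, 10],
              [10, 10, 10]]
frame = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 12]]

motion_mask = frame_difference(background, frame)
for row in motion_mask:
    print(row)
```

Every pixel in the mask is either 0 or 1. The small change from 10 to 12 in the corner stays below the threshold and is ignored, while the bright object in the center is marked as motion.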
Unfortunately, more than one paper works with the false assumption that there is a need for managing uncertainty. A similar concept is used in the domain of traffic surveillance. The idea is that the traffic flow has a normal density within a certain range and an abnormal density above that range.[2] Sure, it's possible to blur the borders with so-called fuzzy sets, but what exactly is the purpose? Traffic surveillance has to be exact, and the system asks for concrete values. In contrast to atoms at the quantum level, the number of cars on the road can be counted.
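The difference between the two views can be shown in a few lines of Python. This is only a sketch under stated assumptions: the limit of 40 cars per kilometer, the ramp from 30 to 50 and both function names are hypothetical and do not come from the cited paper.

```python
def classify_crisp(cars_per_km, limit=40):
    """Crisp rule: density above the limit is abnormal, everything else is normal.

    The limit of 40 cars per kilometer is an illustrative assumption.
    """
    return "abnormal" if cars_per_km > limit else "normal"


def abnormal_membership(cars_per_km, low=30, high=50):
    """Fuzzy alternative: degree of "abnormal" between 0.0 and 1.0.

    Uses a linear ramp between the hypothetical bounds low and high.
    """
    if cars_per_km <= low:
        return 0.0
    if cars_per_km >= high:
        return 1.0
    return (cars_per_km - low) / (high - low)


for density in (35, 45):
    print(density, classify_crisp(density), abnormal_membership(density))
```

The crisp rule returns a concrete label for a counted value. The fuzzy version returns a degree such as 0.25 or 0.75, which still has to be turned into a decision by a second threshold later on.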