﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
17609	Detect invalid usage of ZWNJ/ZWJ characters	Don-vip	team	"Follow-up of #17595. We should detect incorrect uses of wikipedia:Zero-width_joiner and wikipedia:Zero-width_non-joiner characters in OSM tags:

''> In practice, ZWNJ is like a space but with a width of zero, and it prevents adjacent characters from joining each other. Although it is a valid character in Persian, there are cases where it appears at an invalid position in a word, so it would be nice to keep warnings for those invalid cases.''
''> ''
''> As an example, think of ""aa*aaa"" as a Persian word containing a ZWNJ, where the asterisk stands for the ZWNJ.''
''> ''
''> The only valid case is aa*aaa''
''> (between two adjacent letters that can join each other).''
''> ''
''> Common invalid cases:''
''> ''
''> * two or more consecutive ZWNJs (like a doubled space): aa^^*^^*aaa''
''> * at the start or end of a word: *aa*aaa or aa*aaa*''
''> * immediately before or after a space character: aa* aaa or aa *aaa (this can happen within a word, because ZWNJ is normally typed with Shift+Space)''
''> * a possibly trickier one:''
''>   * Seven letters (و, ژ, ز, ر, ذ, د, ا) do not connect to a following letter, so writing a ZWNJ after them is unnecessary. Assuming b is one of them, this is invalid: ab*aaa (this case can also occur in other languages with similar but not identical letters).''"	enhancement	new	normal		Core validator			unicode persian arabic	
