commit
b37c575759dc4535ccc03241c584ad5fe69e3b25
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Sun Jul 4 21:06:02 2010 +0900
aufs: minor update about the doubling donations
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
799db4b1d59ea0ffc999889bc6985397333e9a13
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jul 2 00:37:56 2010 +0900
aufs: compat_ioctl, implement the operations
(One commit in a series that adds support for 32-bit emulation under a
64-bit kernel. Every commit is git-bisect-able, but you should read all
commits in the series since a single commit may mean little on its own.)
Implement f_op->compat_ioctl().
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
765378b55abcabfe3344e5fcf2eabd6a1d52abc0
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jul 2 00:35:17 2010 +0900
aufs: compat_ioctl, make a room for a pointer
(One commit in a series that adds support for 32-bit emulation under a
64-bit kernel. Every commit is git-bisect-able, but you should read all
commits in the series since a single commit may mean little on its own.)
To keep the layout compatible, make room for a pointer and always
handle it as a 64-bit value.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
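Editorial illustration, not part of the commit: a minimal userspace sketch of
the "room for a pointer" idea described above. The names demo_ent, demo_ent_ul,
demo_rdu, demo_set_ptr() and demo_get_ptr() are hypothetical and are not the
aufs definitions. It assumes pointers are at most 64 bits wide. The union pairs
the pointer with a fixed 64-bit integer, so the slot occupies 8 bytes for both
32-bit and 64-bit callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry type; only pointers to it are used here. */
struct demo_ent;

/*
 * The pointer shares storage with a fixed 64-bit integer, so the slot is
 * 8 bytes wide whether pointers are 4 or 8 bytes.
 */
union demo_ent_ul {
	struct demo_ent *e;
	uint64_t ul;
};

/* Hypothetical ioctl argument that embeds the slot. */
struct demo_rdu {
	uint64_t sz;
	union demo_ent_ul ent;
	uint32_t blk;
};

/* The layout property this sketch relies on (C11 static assertion). */
static_assert(sizeof(union demo_ent_ul) == 8, "slot must be 64 bits wide");

/* Store a pointer into the slot through the 64-bit member. */
static inline void demo_set_ptr(union demo_ent_ul *slot, void *p)
{
	slot->ul = (uint64_t)(uintptr_t)p;
}

/* Read the pointer back through the 64-bit member. */
static inline void *demo_get_ptr(const union demo_ent_ul *slot)
{
	return (void *)(uintptr_t)slot->ul;
}
```

Always going through the 64-bit member also clears the upper half of the slot
when the caller's pointers are only 32 bits wide.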
commit
abd339c757ea095f1affdd648d0e9e598213a790
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jul 2 00:29:00 2010 +0900
aufs: compat_ioctl, remove verifying the size of ptr
(One commit in a series that adds support for 32-bit emulation under a
64-bit kernel. Every commit is git-bisect-able, but you should read all
commits in the series since a single commit may mean little on its own.)
Remove the check of the size of ptr, which is meaningless.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
3c979528184058b184608e6e3086c4c59b0a6c86
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jul 2 00:04:33 2010 +0900
aufs: follow 2.6.28, new flag LOOKUP_EXCL
NFS replaces the internal test for LOOKUP_CREATE with a new flag,
LOOKUP_EXCL.
Aufs has to suppress this internal test in order to know whether the
file exists or not.
Reported-by: "Ian Stakenvicius, Aerobiology Research" <ian@aerobiology.ca>
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
396a2f097a6b278fc2a9e9da83b87255f60075fb
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Tue Jun 29 14:32:26 2010 +0900
aufs: bugfix, separate the workqueue for preprocessing mmap
A variation of the common AB-BA deadlock problem.
ProcessA:
- aufs_mmap
+ pre-process with using a workqueue
+ wait until return from the workqueue
Workqueue task for ProcessA:
- acquire aufs rwsem
ProcessB:
- lookup or readdir in aufs
+ acquire aufs rwsem
+ assign a new inode number
+ write the value to the XINO file using a workqueue
+ wait until return from the workqueue
Since the workqueue handles requests one at a time, both ProcessA and
ProcessB wait forever.
This bug was introduced by the commit
d986fa5 2010-03-08
aufs: bugfix, another approach to keep the lock order of mmap_sem
which added the most recent workqueue task.
It is also the only task that acquires such a lock in the workqueue.
To fix it, introduce another workqueue that is used for preprocessing mmap only.
This commit makes the approach uglier, but I don't have another option.
Reported-by: Oliver Welter <mail@oliwel.de>
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
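Editorial illustration, not part of the commit: the deadlock described above
can be modelled as a wait-for graph. This is a simplified userspace sketch, not
kernel code, and the names (PROC_A, TASK_A, build_graph(), deadlocked()) are
hypothetical. It assumes each party waits on at most one other party and that
the shared workqueue runs one request at a time, as the message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Nodes of a hypothetical wait-for graph for the scenario above. */
enum {
	PROC_A,		/* ProcessA, inside aufs_mmap */
	TASK_A,		/* workqueue task queued by ProcessA */
	PROC_B,		/* ProcessB, inside lookup or readdir */
	TASK_B,		/* XINO write task queued by ProcessB */
	NR_NODES
};

/* waits_for[x] is the node x is blocked on, or -1 if x can run. */
static void build_graph(int waits_for[NR_NODES], bool separate_pre_queue)
{
	waits_for[PROC_A] = TASK_A;	/* waits for its task to return */
	waits_for[TASK_A] = PROC_B;	/* needs the rwsem ProcessB holds */
	waits_for[PROC_B] = TASK_B;	/* waits for its task to return */
	/*
	 * On a shared queue that handles one request at a time, TASK_B sits
	 * behind TASK_A. With a separate queue for the mmap preprocessing,
	 * TASK_B can run immediately.
	 */
	waits_for[TASK_B] = separate_pre_queue ? -1 : TASK_A;
}

/* Follow the chain from 'start'; true if it runs into a cycle. */
static bool deadlocked(const int waits_for[NR_NODES], int start)
{
	bool seen[NR_NODES];
	int cur = start;

	assert(start >= 0 && start < NR_NODES);
	memset(seen, 0, sizeof(seen));
	while (cur != -1) {
		if (seen[cur])
			return true;
		seen[cur] = true;
		cur = waits_for[cur];
	}
	return false;
}
```

With separate_pre_queue false the chain from PROC_A loops back to TASK_A. With
it true the chain ends at TASK_B, which matches the intent of the fix.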
commit
7c7f493d58745e45160ccebfbfb5e6244dbd0b52
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jun 25 10:52:59 2010 +0900
aufs: tiny, debug print [if]_version
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
ade8662cb1184703ebdd7c7d07cb42d2056a605b
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Sat Jun 19 16:38:15 2010 +0900
aufs: follow the changes in 2.6.35, lockdep for sb->s_vfs_rename_mutex
lockdep_set_class() is applied to sb->s_vfs_rename_mutex, and
lockdep_off/on() in aufs become unnecessary.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
d405a78a328658600f1928309bfbd2ded6136c59
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Sat Jun 19 03:36:21 2010 +0900
aufs: tiny, remove unused lockdep_off/on()
In linux-2.6.31, lockdep_set_class() was applied to them.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
a4862273fc684dd24a9e0b4c29f89aaf53723afc
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Thu Jun 17 23:39:13 2010 +0900
aufs: several GIT servers
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
135ac88ed89e4780dc71bef119b803d4c594ea07
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 16 18:11:40 2010 +0900
aufs: possible bugfix, sbinfo lock in deleting inode, lockdep
(One commit in a series that introduces a pid map/tree and makes sure to
acquire the sbinfo lock when deleting an inode. Every commit is
git-bisect-able, but you should read all commits in the series since a
single commit may mean little on its own.)
Lockdep, a debugging feature in the Linux kernel, warns
"inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage."
about sb->s_umount and the aufs sbinfo lock.
This is bogus, a "false positive", since the {RECLAIM_FS-ON-W} state was
registered when allocating the root inode at mount time. That is definitely
not a RECLAIM state. It may be a limitation of s_umount in lockdep.
Let's simply silence it.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
6ef8250091897e1cf7e8b77a7cfc61c7ffe58b58
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 16 18:03:48 2010 +0900
aufs: possible bugfix, sbinfo lock in deleting inode, core
(One commit in a series that introduces a pid map/tree and makes sure to
acquire the sbinfo lock when deleting an inode. Every commit is
git-bisect-able, but you should read all commits in the series since a
single commit may mean little on its own.)
The s_umount rwsem in struct super_block prevents a race condition among
umount, remount and kswapd. It is good.
But if an inode is going to be deleted by something other than kswapd, aufs
may be doing another operation which requires a lock for sbinfo. In this
case, si_noflush_read_trylock() in au_iinfo_fin() will not acquire the
lock. Before au_iinfo_fin() completes or during its operations, another
operation may release the lock and remount or a branch management
process may start. If the branch management process then changes the
union members, xino management in au_iinfo_fin() will not work
correctly.
To fix this potential problem, another bad approach is introduced,
which uses a bitmap to mark the pid that acquired the sbinfo
lock. In au_iinfo_fin(), if the bit is set, the function will not
try to acquire the lock.
To support pids larger than PID_MAX_DEFAULT, sbinfo prepares a radix
tree too.
With this commit, we can remove si_noflush_read_trylock(). But it will
be necessary in the aufs2-31 branch, so leave it for now.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
eb6254f4b0080f01b3356a40bdd62d7130bf1951
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 16 17:51:00 2010 +0900
aufs: possible bugfix, sbinfo lock in deleting inode, use si_pid
(One commit in a series that introduces a pid map/tree and makes sure to
acquire the sbinfo lock when deleting an inode. Every commit is
git-bisect-able, but you should read all commits in the series since a
single commit may mean little on its own.)
Use si_pid, which was declared and implemented by the last commit.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
89d59298f045a8a7e2c5a667dfc3ee3b3fe8745a
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 16 17:43:12 2010 +0900
aufs: possible bugfix, sbinfo lock in deleting inode, si_pid functions
(One commit in a series that introduces a pid map/tree and makes sure to
acquire the sbinfo lock when deleting an inode. Every commit is
git-bisect-able, but you should read all commits in the series since a
single commit may mean little on its own.)
Declare and implement si_pid functions which are not used yet.
The pids from 1 to PID_MAX_DEFAULT are marked in a new bitmap in
sbinfo. Larger pids go to a new radix tree in sbinfo.
These marks will be referenced by au_iinfo_fin() in a succeeding commit.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
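Editorial illustration, not part of the commit: a minimal userspace sketch of
the two-tier pid marking described above. It is not the aufs implementation.
All demo_* names are hypothetical, DEMO_PID_MAX is an assumed limit chosen for
the sketch, and a small fixed table stands in for the radix tree. The mapping
of pid N to bit N - 1 is also an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PID_MAX		0x8000	/* assumed small-pid limit */
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NLONGS \
	((DEMO_PID_MAX + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)
#define DEMO_NLARGE		16	/* capacity of the fallback table */

struct demo_pid_map {
	unsigned long bitmap[DEMO_NLONGS];	/* pids 1..DEMO_PID_MAX */
	int large[DEMO_NLARGE];	/* larger pids; 0 marks an empty slot */
};

/* Word index and bit mask of a small pid (pid N uses bit N - 1). */
static size_t demo_idx(int pid)
{
	return ((size_t)pid - 1) / DEMO_BITS_PER_LONG;
}

static unsigned long demo_mask(int pid)
{
	return 1UL << (((size_t)pid - 1) % DEMO_BITS_PER_LONG);
}

/* Find the slot holding 'pid' in the fallback table (0 finds a free one). */
static int *demo_large_find(struct demo_pid_map *m, int pid)
{
	for (size_t i = 0; i < DEMO_NLARGE; i++)
		if (m->large[i] == pid)
			return &m->large[i];
	return NULL;
}

/* Mark 'pid'; returns false only if the fallback table is full. */
static bool demo_pid_set(struct demo_pid_map *m, int pid)
{
	int *slot;

	assert(pid > 0);
	if (pid <= DEMO_PID_MAX) {
		m->bitmap[demo_idx(pid)] |= demo_mask(pid);
		return true;
	}
	if (demo_large_find(m, pid))
		return true;	/* already marked */
	slot = demo_large_find(m, 0);
	if (!slot)
		return false;
	*slot = pid;
	return true;
}

static bool demo_pid_test(struct demo_pid_map *m, int pid)
{
	assert(pid > 0);
	if (pid <= DEMO_PID_MAX)
		return (m->bitmap[demo_idx(pid)] & demo_mask(pid)) != 0;
	return demo_large_find(m, pid) != NULL;
}

static void demo_pid_clear(struct demo_pid_map *m, int pid)
{
	int *slot;

	assert(pid > 0);
	if (pid <= DEMO_PID_MAX) {
		m->bitmap[demo_idx(pid)] &= ~demo_mask(pid);
		return;
	}
	slot = demo_large_find(m, pid);
	if (slot)
		*slot = 0;
}
```

The common case costs one bit operation. Only pids above the limit pay for the
slower lookup.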
commit
40d554b0f6a6bc588fd88235f39b59f30ec27114
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 16 17:30:03 2010 +0900
aufs: possible bugfix, sbinfo lock in deleting inode, __si_ lock
(One commit in a series that introduces a pid map/tree and makes sure to
acquire the sbinfo lock when deleting an inode. Every commit is
git-bisect-able, but you should read all commits in the series since a
single commit may mean little on its own.)
Rename si_noflush_... lock macros to __si_..., and create new inlined
functions si_noflush_....
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
f2f51daf576fa90ad7209288ed8b19c8151ff92a
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 9 17:03:00 2010 +0900
aufs: tiny, simplify the locks in au_do_flush()
The read-lock for dinfo is unnecessary.
Also the write-lock for iinfo should be a read-lock.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
bdc941c464fd4181f7ab76fbb992b6ab150ac188
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Tue Jun 8 23:28:03 2010 +0900
aufs: minor optimization au_iinfo_fin()
- extract a part of au_iinfo_fin(), create a new function
au_xino_delete_inode(), and remove au_iinfo_write0().
- simplify au_xino_write0() and rename to au_xib_clear_bit().
- convert the type of au_xigen_inc() into void.
- stop testing 'xino' option in au_xigen_inc().
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
7e7b86c68cf5d3638ceb19a0a1d7a254d8501bde
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Tue Jun 8 00:08:00 2010 +0900
aufs: tiny, remove an unnecessary variable
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
ea824b7a71cd905fdadcb0998ab2005fcb3786e9
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Tue Jun 8 00:06:55 2010 +0900
aufs: minor optimization, pass the 'verbose' flag
Stop testing the 'verbose' flag in all test_dentry_busy(),
test_inode_busy() and au_br_del() functions. Instead pass the tested
result from au_br_del() to test_dentry_busy() and test_inode_busy().
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
cabc97fe740d2c7664988e76f218318aecd5ec18
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Mon Jun 7 14:45:15 2010 +0900
aufs: tiny, test task flags instead of mm
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
575bf8c6484fd12a0719a7ebcf6c0d2fd0af31f3
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jun 4 14:55:42 2010 +0900
aufs: follow linux-2.6.35, simple_setsize()
Replace vmtruncate() by a new function simple_setsize().
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
c503a51d10c9c7767f1eba8e47de545d31ab7858
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jun 4 02:54:13 2010 +0900
aufs: bugfix, the dentry parameter for security funcs
Pass the correct parameter.
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
92172f59e363f630ab44217e78135cdeff90ba0d
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Fri Jun 4 02:09:23 2010 +0900
aufs: tiny, fake type-cast by union
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
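Editorial illustration, not part of the commit: a minimal userspace sketch of
the "fake type-cast by union" pattern. The names demo_readlink() and
demo_read_symlink() are hypothetical, and DEMO_USER stands in for the kernel's
__user annotation, which has no effect outside a sparse check. The same buffer
is reached through two union members, so no explicit pointer cast appears at
the call site.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's __user annotation; empty in userspace. */
#define DEMO_USER

/* Hypothetical callee that expects a "user" buffer, like ->readlink(). */
static int demo_readlink(char DEMO_USER *buf, size_t len)
{
	static const char target[] = "target";
	size_t n = sizeof(target) - 1;

	if (n > len)
		n = len;
	memcpy(buf, target, n);
	return (int)n;
}

/* Fill a caller-owned buffer without an explicit pointer cast. */
static int demo_read_symlink(char *out, size_t size)
{
	union {
		char *k;		/* how the caller uses the buffer */
		char DEMO_USER *u;	/* how the callee declares it */
	} sym;
	int len;

	assert(size > 0);
	sym.k = out;
	len = demo_readlink(sym.u, size - 1);	/* leave room for the NUL */
	if (len >= 0)
		sym.k[len] = 0;
	return len;
}
```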
commit
9f8ad8ca00cf7677e9d982c9e98d4b7586e759f6
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Wed Jun 2 23:38:37 2010 +0900
aufs: possible bugfix, revalidate inode race between readdir and lookup
Both the readdir and lookup operations need to assign the aufs inode
number, which requires other aufs locks including xi_nondir_mtx, the mutex
that protects hard-linked inode numbers from a race condition. The order of
these locks can be violated.
They acquire these locks:
aufs_readdir("./dirA")
+ si_read_lock
+ fi_write_lock
+ di_write_lock
+ ii_write_lock for dirA
+ xi_nondir_mtx for non-dir
aufs_lookup("./dirA/fileB")
+ si_read_lock
+ di_write_lock
+ xi_nondir_mtx for non-dir
+ ii_write_lock_nested for fileB
Here fileB may be in a copy-up operation, which acquires the parent's
dentry-info and inode-info locks. So aufs_lookup() waits for the
completion of copy-up, aufs_readdir() waits for xi_nondir_mtx, and the
copy-up waits for the parent, but it is held by readdir.
This is a very complicated situation and I am afraid the design of aufs
inode assignment is not good. But I don't have another idea.
This commit refines xi_nondir_mtx and releases it before
"ii_write_lock_nested for fileB."
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
commit
907f03a11f5a681de402d733b6e80532adece324
Author: J. R. Okajima <hooanon05@yahoo.co.jp>
Date: Tue Jun 1 01:41:43 2010 +0900
aufs: tiny, follow the changes in linux-2.6.35-rcN
The dentry parameter is removed from ->fsync().
Signed-off-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
URL: http://git.c3sl.ufpr.br/pub/scm/aufs/aufs2-standalone.git
-COMMIT: a9be01e5e9688018ebe9ef46ec5414bb356bc556
+COMMIT: b37c575759dc4535ccc03241c584ad5fe69e3b25
-EXTRA_CFLAGS += -I$(src)/include
+EXTRA_CFLAGS += -I$(src)/include
include ${src}/magic.mk
ifeq (${CONFIG_AUFS_FS},m)
else
break;
- /* some filesystems acquire extra lock */
- /* lockdep_off(); */
mntput(br->br_mnt);
- /* lockdep_on(); */
-
kfree(wbr);
kfree(br);
}
* test if the branch is deletable or not.
*/
static int test_dentry_busy(struct dentry *root, aufs_bindex_t bindex,
- unsigned int sigen)
+ unsigned int sigen, const unsigned int verbose)
{
int err, i, j, ndentry;
aufs_bindex_t bstart, bend;
- unsigned char verbose;
struct au_dcsub_pages dpages;
struct au_dpage *dpage;
struct dentry *d;
if (unlikely(err))
goto out_dpages;
- verbose = !!au_opt_test(au_mntflags(root->d_sb), VERBOSE);
for (i = 0; !err && i < dpages.ndpage; i++) {
dpage = dpages.dpages + i;
ndentry = dpage->ndentry;
}
static int test_inode_busy(struct super_block *sb, aufs_bindex_t bindex,
- unsigned int sigen)
+ unsigned int sigen, const unsigned int verbose)
{
int err;
struct inode *i;
aufs_bindex_t bstart, bend;
- unsigned char verbose;
err = 0;
- verbose = !!au_opt_test(au_mntflags(sb), VERBOSE);
list_for_each_entry(i, &sb->s_inodes, i_sb_list) {
AuDebugOn(!atomic_read(&i->i_count));
if (!list_empty(&i->i_dentry))
return err;
}
-static int test_children_busy(struct dentry *root, aufs_bindex_t bindex)
+static int test_children_busy(struct dentry *root, aufs_bindex_t bindex,
+ const unsigned int verbose)
{
int err;
unsigned int sigen;
DiMustNoWaiters(root);
IiMustNoWaiters(root->d_inode);
di_write_unlock(root);
- err = test_dentry_busy(root, bindex, sigen);
+ err = test_dentry_busy(root, bindex, sigen, verbose);
if (!err)
- err = test_inode_busy(root->d_sb, bindex, sigen);
+ err = test_inode_busy(root->d_sb, bindex, sigen, verbose);
di_write_lock_child(root); /* aufs_write_lock() calls ..._child() */
return err;
}
}
- err = test_children_busy(sb->s_root, bindex);
+ err = test_children_busy(sb->s_root, bindex, verbose);
if (unlikely(err)) {
if (do_wh)
goto out_wh;
struct file *au_xino_create2(struct file *base_file, struct file *copy_src);
struct file *au_xino_create(struct super_block *sb, char *fname, int silent);
ino_t au_xino_new_ino(struct super_block *sb);
-int au_xino_write0(struct super_block *sb, aufs_bindex_t bindex, ino_t h_ino,
- ino_t ino);
+void au_xino_delete_inode(struct inode *inode, const int unlinked);
int au_xino_write(struct super_block *sb, aufs_bindex_t bindex, ino_t h_ino,
ino_t ino);
int au_xino_read(struct super_block *sb, aufs_bindex_t bindex, ino_t h_ino,
{
int err, symlen;
mm_segment_t old_fs;
- char *sym;
+ union {
+ char *k;
+ char __user *u;
+ } sym;
err = -ENOSYS;
if (unlikely(!h_src->d_inode->i_op->readlink))
goto out;
err = -ENOMEM;
- sym = __getname_gfp(GFP_NOFS);
- if (unlikely(!sym))
+ sym.k = __getname_gfp(GFP_NOFS);
+ if (unlikely(!sym.k))
goto out;
old_fs = get_fs();
set_fs(KERNEL_DS);
- symlen = h_src->d_inode->i_op->readlink(h_src, (char __user *)sym,
- PATH_MAX);
+ symlen = h_src->d_inode->i_op->readlink(h_src, sym.u, PATH_MAX);
err = symlen;
set_fs(old_fs);
if (symlen > 0) {
- sym[symlen] = 0;
- err = vfsub_symlink(h_dir, h_path, sym);
+ sym.k[symlen] = 0;
+ err = vfsub_symlink(h_dir, h_path, sym.k);
}
- __putname(sym);
+ __putname(sym.k);
out:
return err;
}
dpri("i%d: i%lu, %s, cnt %d, nl %u, 0%o, sz %llu, blk %llu,"
- " ct %lld, np %lu, st 0x%lx, f 0x%x, g %x%s%.*s\n",
+ " ct %lld, np %lu, st 0x%lx, f 0x%x, v %llu, g %x%s%.*s\n",
bindex,
inode->i_ino, inode->i_sb ? au_sbtype(inode->i_sb) : "??",
atomic_read(&inode->i_count), inode->i_nlink, inode->i_mode,
i_size_read(inode), (unsigned long long)inode->i_blocks,
(long long)timespec_to_ns(&inode->i_ctime) & 0x0ffff,
inode->i_mapping ? inode->i_mapping->nrpages : 0,
- inode->i_state, inode->i_flags, inode->i_generation,
+ inode->i_state, inode->i_flags, inode->i_version,
+ inode->i_generation,
l ? ", wh " : "", l, n);
return 0;
}
&& au_fi(file))
snprintf(a, sizeof(a), ", mmapped %d",
!!au_fi(file)->fi_hvmop);
- dpri("f%d: mode 0x%x, flags 0%o, cnt %ld, pos %llu%s\n",
+ dpri("f%d: mode 0x%x, flags 0%o, cnt %ld, v %llu, pos %llu%s\n",
bindex, file->f_mode, file->f_flags, (long)file_count(file),
- file->f_pos, a);
+ file->f_version, file->f_pos, a);
if (file->f_dentry)
do_pri_dentry(bindex, file->f_dentry);
return 0;
* due to whiteout and branch permission.
*/
h_nd->flags &= ~(/*LOOKUP_PARENT |*/ LOOKUP_OPEN | LOOKUP_CREATE
- | LOOKUP_FOLLOW);
+ | LOOKUP_FOLLOW | LOOKUP_EXCL);
/* unnecessary? */
h_nd->intent.open.file = NULL;
} else
/* ---------------------------------------------------------------------- */
-#if 0
static int au_do_fsync_dir_no_file(struct dentry *dentry, int datasync)
{
int err;
err = filemap_fdatawrite(h_inode->i_mapping);
AuDebugOn(!h_inode->i_fop);
if (!err && h_inode->i_fop->fsync)
- err = h_inode->i_fop->fsync(NULL, h_path.dentry,
- datasync);
+ err = h_inode->i_fop->fsync(NULL, datasync);
if (!err)
err = filemap_fdatawrite(h_inode->i_mapping);
if (!err)
return err;
}
-#endif
static int au_do_fsync_dir(struct file *file, int datasync)
{
static int aufs_fsync_dir(struct file *file, int datasync)
{
int err;
- struct super_block *sb;
struct dentry *dentry;
+ struct super_block *sb;
- if (!file) {
- WARN_ON(1);
- return -ENOTSUPP;
- }
dentry = file->f_dentry;
-
IMustLock(dentry->d_inode);
err = 0;
si_noflush_read_lock(sb);
if (file)
err = au_do_fsync_dir(file, datasync);
-/*
else {
di_write_lock_child(dentry);
err = au_do_fsync_dir_no_file(dentry, datasync);
}
-*/
au_cpup_attr_timesizes(dentry->d_inode);
di_write_unlock(dentry);
if (file)
di_read_unlock(dentry, AuLock_IR);
si_read_unlock(sb);
- /* lockdep_off(); */
err = au_vdir_fill_de(file, dirent, filldir);
- /* lockdep_on(); */
fsstack_copy_attr_atime(inode, h_inode);
fi_write_unlock(file);
.read = generic_read_dir,
.readdir = aufs_readdir,
.unlocked_ioctl = aufs_ioctl_dir,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = aufs_compat_ioctl_dir,
+#endif
.open = aufs_open_dir,
.release = aufs_release_dir,
.flush = aufs_flush_dir,
#ifdef CONFIG_AUFS_RDU
/* rdu.c */
long au_rdu_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
+#ifdef CONFIG_COMPAT
+long au_rdu_compat_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg);
+#endif
#else
static inline long au_rdu_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
return -EINVAL;
}
+#ifdef CONFIG_COMPAT
+static inline long au_rdu_compat_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ return -EINVAL;
+}
+#endif
#endif
#endif /* __KERNEL__ */
/* ---------------------------------------------------------------------- */
/* inode generation external table */
-int au_xigen_inc(struct inode *inode)
+void au_xigen_inc(struct inode *inode)
{
- int err;
loff_t pos;
ssize_t sz;
__u32 igen;
struct super_block *sb;
struct au_sbinfo *sbinfo;
- err = 0;
sb = inode->i_sb;
- sbinfo = au_sbi(sb);
- /*
- * temporary workaround for escaping from SiMustAnyLock() in
- * au_mntflags(), since this function is called from au_iinfo_fin().
- */
- if (unlikely(!au_opt_test(sbinfo->si_mntflags, XINO)))
- goto out;
+ AuDebugOn(!au_opt_test(au_mntflags(sb), XINO));
+ sbinfo = au_sbi(sb);
pos = inode->i_ino;
pos *= sizeof(igen);
igen = inode->i_generation + 1;
sz = xino_fwrite(sbinfo->si_xwrite, sbinfo->si_xigen, &igen,
sizeof(igen), &pos);
if (sz == sizeof(igen))
- goto out; /* success */
+ return; /* success */
- err = sz;
- if (unlikely(sz >= 0)) {
- err = -EIO;
+ if (unlikely(sz >= 0))
AuIOErr("xigen error (%zd)\n", sz);
- }
-
- out:
- return err;
}
int au_xigen_new(struct inode *inode)
.errp = &err
};
- wkq_err = au_wkq_wait(au_call_mmap_pre, &args);
+ wkq_err = au_wkq_wait_pre(au_call_mmap_pre, &args);
if (unlikely(wkq_err))
err = wkq_err;
if (unlikely(err))
{
int err;
struct au_pin pin;
+ struct dentry *dentry;
struct inode *inode;
struct file *h_file;
struct super_block *sb;
- struct dentry *dentry = file->f_dentry;
+ dentry = file->f_dentry;
inode = dentry->d_inode;
IMustLock(file->f_mapping->host);
if (inode != file->f_mapping->host) {
err = -EINVAL;
h_file = au_hf_top(file);
if (h_file->f_op && h_file->f_op->fsync) {
- struct dentry *h_d;
struct mutex *h_mtx;
/*
* no filemap_fdatawrite() since aufs file has no its own
* mapping, but dir.
*/
- h_d = h_file->f_dentry;
- h_mtx = &h_d->d_inode->i_mutex;
+ h_mtx = &h_file->f_dentry->d_inode->i_mutex;
mutex_lock_nested(h_mtx, AuLsc_I_CHILD);
err = h_file->f_op->fsync(h_file, datasync);
if (!err)
.poll = aufs_poll,
#endif
.unlocked_ioctl = aufs_ioctl_nondir,
+#ifdef CONFIG_COMPAT
+ .compat_ioctl = aufs_ioctl_nondir, /* same */
+#endif
.mmap = aufs_mmap,
.open = aufs_open_nondir,
.flush = aufs_flush_nondir,
inode = dentry->d_inode;
si_noflush_read_lock(sb);
fi_read_lock(file);
- di_read_lock_child(dentry, AuLock_IW);
+ ii_read_lock_child(inode);
err = flush(file, id);
au_cpup_attr_timesizes(inode);
- di_read_unlock(dentry, AuLock_IW);
+ ii_read_unlock(inode);
fi_read_unlock(file);
si_read_unlock(sb);
return err;
/* ioctl.c */
long aufs_ioctl_nondir(struct file *file, unsigned int cmd, unsigned long arg);
+#ifdef CONFIG_COMPAT
+long aufs_compat_ioctl_dir(struct file *file, unsigned int cmd,
+ unsigned long arg);
+#endif
/* ---------------------------------------------------------------------- */
err = -ENOMEM;
/* iput() and kfree() will be called in au_hnotify() */
- /*
- * inotify_mutex is already acquired and kmalloc/prune_icache may lock
- * iprune_mutex. strange.
- */
- /* lockdep_off(); */
args = kmalloc(sizeof(*args) + len + 1, GFP_NOFS);
- /* lockdep_on(); */
if (unlikely(!args)) {
AuErr1("no memory\n");
iput(dir);
p[len] = 0;
}
- /* lockdep_off(); */
err = au_wkq_nowait(au_hn_bh, args, dir->i_sb);
- /* lockdep_on(); */
if (unlikely(err)) {
pr_err("wkq %d\n", err);
iput(args->h_child_inode);
struct nameidata *nd)
{
struct dentry *ret, *parent;
- struct inode *inode, *h_inode;
- struct mutex *mtx;
+ struct inode *inode;
struct super_block *sb;
int err, npositive;
- aufs_bindex_t bstart;
IMustLock(dir);
inode = NULL;
if (npositive) {
- bstart = au_dbstart(dentry);
- h_inode = au_h_dptr(dentry, bstart)->d_inode;
- if (!S_ISDIR(h_inode->i_mode)) {
- /*
- * stop 'race'-ing between hardlinks under different
- * parents.
- */
- mtx = &au_sbr(sb, bstart)->br_xino.xi_nondir_mtx;
- mutex_lock(mtx);
- inode = au_new_inode(dentry, /*must_new*/0);
- mutex_unlock(mtx);
- } else
- inode = au_new_inode(dentry, /*must_new*/0);
+ inode = au_new_inode(dentry, /*must_new*/0);
ret = (void *)inode;
}
if (IS_ERR(inode))
if (ia->ia_size < i_size_read(inode)) {
/* unmap only */
- err = vmtruncate(inode, ia->ia_size);
+ err = simple_setsize(inode, ia->ia_size);
if (unlikely(err))
goto out_unlock;
}
static void *aufs_follow_link(struct dentry *dentry, struct nameidata *nd)
{
int err;
- char *buf;
mm_segment_t old_fs;
+ union {
+ char *k;
+ char __user *u;
+ } buf;
err = -ENOMEM;
- buf = __getname_gfp(GFP_NOFS);
- if (unlikely(!buf))
+ buf.k = __getname_gfp(GFP_NOFS);
+ if (unlikely(!buf.k))
goto out;
aufs_read_lock(dentry, AuLock_IR);
old_fs = get_fs();
set_fs(KERNEL_DS);
- err = h_readlink(dentry, au_dbstart(dentry), (char __user *)buf,
- PATH_MAX);
+ err = h_readlink(dentry, au_dbstart(dentry), buf.u, PATH_MAX);
set_fs(old_fs);
aufs_read_unlock(dentry, AuLock_IR);
if (err >= 0) {
- buf[err] = 0;
+ buf.k[err] = 0;
/* will be freed by put_link */
- nd_set_link(nd, buf);
+ nd_set_link(nd, buf.k);
return NULL; /* success */
}
- __putname(buf);
+ __putname(buf.k);
out:
path_put(&nd->path);
return err;
}
-static int au_iinfo_write0(struct super_block *sb, struct au_hinode *hinode,
- ino_t ino)
-{
- int err;
- aufs_bindex_t bindex;
- unsigned char locked;
-
- err = 0;
- locked = !!si_noflush_read_trylock(sb);
- bindex = au_br_index(sb, hinode->hi_id);
- if (bindex >= 0)
- err = au_xino_write0(sb, bindex, hinode->hi_inode->i_ino, ino);
- /* error action? */
- if (locked)
- si_read_unlock(sb);
- return err;
-}
-
void au_iinfo_fin(struct inode *inode)
{
- ino_t ino;
- aufs_bindex_t bend;
- unsigned char unlinked = !inode->i_nlink;
struct au_iinfo *iinfo;
struct au_hinode *hi;
struct super_block *sb;
-
- if (unlinked) {
- int err = au_xigen_inc(inode);
- if (unlikely(err))
- AuWarn1("failed resetting i_generation, %d\n", err);
- }
+ aufs_bindex_t bindex, bend;
+ const unsigned char unlinked = !inode->i_nlink;
iinfo = au_ii(inode);
/* bad_inode case */
if (!iinfo)
return;
+ sb = inode->i_sb;
+ if (si_pid_test(sb))
+ au_xino_delete_inode(inode, unlinked);
+ else {
+ /*
+ * it is safe to hide the dependency between sbinfo and
+ * sb->s_umount.
+ */
+ lockdep_off();
+ si_noflush_read_lock(sb);
+ au_xino_delete_inode(inode, unlinked);
+ si_read_unlock(sb);
+ lockdep_on();
+ }
+
if (iinfo->ii_vdir)
au_vdir_free(iinfo->ii_vdir);
- if (iinfo->ii_bstart >= 0) {
- sb = inode->i_sb;
- ino = 0;
- if (unlinked)
- ino = inode->i_ino;
- hi = iinfo->ii_hinode + iinfo->ii_bstart;
+ bindex = iinfo->ii_bstart;
+ if (bindex >= 0) {
+ hi = iinfo->ii_hinode + bindex;
bend = iinfo->ii_bend;
- while (iinfo->ii_bstart++ <= bend) {
- if (hi->hi_inode) {
- if (unlinked || !hi->hi_inode->i_nlink) {
- au_iinfo_write0(sb, hi, ino);
- /* ignore this error */
- ino = 0;
- }
+ while (bindex++ <= bend) {
+ if (hi->hi_inode)
au_hiput(hi);
- }
hi++;
}
}
-
kfree(iinfo->ii_hinode);
AuRwDestroy(&iinfo->ii_rwsem);
}
#include <linux/limits.h>
#include <linux/types.h>
-#define AUFS_VERSION "2-standalone.tree-35-rcN-20100531"
+#define AUFS_VERSION "2-standalone.tree-35-rcN-20100705"
/* todo? move this to linux-2.6.19/include/magic.h */
#define AUFS_SUPER_MAGIC ('a' << 24 | 'u' << 16 | 'f' << 8 | 's')
#define AUFS_RDBLK_DEF 512 /* bytes */
#define AUFS_RDHASH_DEF 32
#define AUFS_WKQ_NAME AUFS_NAME "d"
+#define AUFS_WKQ_PRE_NAME AUFS_WKQ_NAME "_pre"
#define AUFS_MFS_SECOND_DEF 30 /* seconds */
#define AUFS_PLINK_WARN 100 /* number of plinks */
union au_rdu_ent_ul {
struct au_rdu_ent __user *e;
- unsigned long ul;
+ __u64 ul;
};
enum {
AufsCtlRduV_SZ,
- AufsCtlRduV_SZ_PTR,
AufsCtlRduV_End
};
{
int err;
struct mutex *mtx;
- const int isdir = (d_type == DT_DIR);
- /* prevent hardlinks from race condition */
+ /* prevent hardlinked inode number from race condition */
mtx = NULL;
- if (!isdir) {
+ if (d_type != DT_DIR) {
mtx = &au_sbr(sb, bindex)->br_xino.xi_nondir_mtx;
mutex_lock(mtx);
}
}
out:
- if (!isdir)
+ if (mtx)
mutex_unlock(mtx);
return err;
}
/* todo: return with unlocked? */
struct inode *au_new_inode(struct dentry *dentry, int must_new)
{
- struct inode *inode;
+ struct inode *inode, *h_inode;
struct dentry *h_dentry;
struct super_block *sb;
+ struct mutex *mtx;
ino_t h_ino, ino;
int err, match;
aufs_bindex_t bstart;
sb = dentry->d_sb;
bstart = au_dbstart(dentry);
h_dentry = au_h_dptr(dentry, bstart);
- h_ino = h_dentry->d_inode->i_ino;
+ h_inode = h_dentry->d_inode;
+ h_ino = h_inode->i_ino;
+
+ /*
+ * stop 'race'-ing between hardlinks under different
+ * parents.
+ */
+ mtx = NULL;
+ if (!S_ISDIR(h_inode->i_mode))
+ mtx = &au_sbr(sb, bstart)->br_xino.xi_nondir_mtx;
+
+ new_ino:
+ if (mtx)
+ mutex_lock(mtx);
err = au_xino_read(sb, bstart, h_ino, &ino);
inode = ERR_PTR(err);
if (unlikely(err))
goto out;
- new_ino:
+
if (!ino) {
ino = au_xino_new_ino(sb);
if (unlikely(!ino)) {
iget_failed(inode);
goto out_err;
} else if (!must_new) {
+ /*
+ * horrible race condition between lookup, readdir and copyup
+ * (or something).
+ */
+ if (mtx)
+ mutex_unlock(mtx);
err = reval_inode(inode, dentry, &match);
- if (!err)
+ if (!err) {
+ mtx = NULL;
goto out; /* success */
- else if (match)
+ } else if (match) {
+ mtx = NULL;
goto out_iput;
+ } else if (mtx)
+ mutex_lock(mtx);
}
if (unlikely(au_test_fs_unique_ino(h_dentry->d_inode)))
err = au_xino_write(sb, bstart, h_ino, /*ino*/0);
if (!err) {
iput(inode);
+ if (mtx)
+ mutex_unlock(mtx);
goto new_ino;
}
out_err:
inode = ERR_PTR(err);
out:
+ if (mtx)
+ mutex_unlock(mtx);
return inode;
}
AuTraceErr(err);
return err;
}
+
+#ifdef CONFIG_COMPAT
+long aufs_compat_ioctl_dir(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ long err;
+
+ switch (cmd) {
+ case AUFS_CTL_RDU:
+ case AUFS_CTL_RDU_INO:
+ err = au_rdu_compat_ioctl(file, cmd, arg);
+ break;
+
+ default:
+ err = aufs_ioctl_dir(file, cmd, arg);
+ }
+
+ AuTraceErr(err);
+ return err;
+}
+
+#if 0 /* unused yet */
+long aufs_compat_ioctl_nondir(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ return aufs_ioctl_nondir(file, cmd, (unsigned long)compat_ptr(arg));
+}
+#endif
+#endif
/* true if a kernel thread named 'loop[0-9].*' accesses a file */
int au_test_loopback_kthread(void)
{
- const char c = current->comm[4];
+ int ret;
- return current->mm == NULL
- && '0' <= c && c <= '9'
- && strncmp(current->comm, "loop", 4) == 0;
+ ret = 0;
+ if (current->flags & PF_KTHREAD) {
+ const char c = current->comm[4];
+ ret = ('0' <= c && c <= '9'
+ && !strncmp(current->comm, "loop", 4));
+ }
+
+ return ret;
}
{
long err;
struct super_block *sb;
- struct au_sbinfo *sbinfo;
err = -EACCES;
if (!capable(CAP_SYS_ADMIN))
err = 0;
sb = file->f_dentry->d_sb;
- sbinfo = au_sbi(sb);
switch (cmd) {
case AUFS_CTL_PLINK_MAINT:
/*
break;
case AUFS_CTL_PLINK_CLEAN:
aufs_write_lock(sb->s_root);
- if (au_opt_test(sbinfo->si_mntflags, PLINK))
+ if (au_opt_test(au_mntflags(sb), PLINK))
au_plink_put(sb);
aufs_write_unlock(sb->s_root);
break;
* readdir in userspace.
*/
+#include <linux/compat.h>
#include <linux/fs_stack.h>
#include <linux/security.h>
#include <linux/uaccess.h>
static int au_rdu_verify(struct aufs_rdu *rdu)
{
- AuDbg("rdu{%llu, %p, (%u, %u) | %u | %llu, %u, %u | "
+ AuDbg("rdu{%llu, %p, %u | %u | %llu, %u, %u | "
"%llu, b%d, 0x%x, g%u}\n",
- rdu->sz, rdu->ent.e, rdu->verify[0], rdu->verify[1],
+ rdu->sz, rdu->ent.e, rdu->verify[AufsCtlRduV_SZ],
rdu->blk,
rdu->rent, rdu->shwh, rdu->full,
rdu->cookie.h_pos, rdu->cookie.bindex, rdu->cookie.flags,
rdu->cookie.generation);
- if (rdu->verify[AufsCtlRduV_SZ] == sizeof(*rdu)
- && rdu->verify[AufsCtlRduV_SZ_PTR] == sizeof(rdu))
+ if (rdu->verify[AufsCtlRduV_SZ] == sizeof(*rdu))
return 0;
- AuDbg("%u:%u, %u:%u\n",
- rdu->verify[AufsCtlRduV_SZ], (unsigned int)sizeof(*rdu),
- rdu->verify[AufsCtlRduV_SZ_PTR], (unsigned int)sizeof(rdu));
+ AuDbg("%u:%u\n",
+ rdu->verify[AufsCtlRduV_SZ], (unsigned int)sizeof(*rdu));
return -EINVAL;
}
AuTraceErr(err);
return err;
}
+
+#ifdef CONFIG_COMPAT
+long au_rdu_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ long err, e;
+ struct aufs_rdu rdu;
+ void __user *p = compat_ptr(arg);
+
+ /* todo: get_user()? */
+ err = copy_from_user(&rdu, p, sizeof(rdu));
+ if (unlikely(err)) {
+ err = -EFAULT;
+ AuTraceErr(err);
+ goto out;
+ }
+ rdu.ent.e = compat_ptr(rdu.ent.ul);
+ err = au_rdu_verify(&rdu);
+ if (unlikely(err))
+ goto out;
+
+ switch (cmd) {
+ case AUFS_CTL_RDU:
+ err = au_rdu(file, &rdu);
+ if (unlikely(err))
+ break;
+
+ rdu.ent.ul = ptr_to_compat(rdu.ent.e);
+ rdu.tail.ul = ptr_to_compat(rdu.tail.e);
+ e = copy_to_user(p, &rdu, sizeof(rdu));
+ if (unlikely(e)) {
+ err = -EFAULT;
+ AuTraceErr(err);
+ }
+ break;
+ case AUFS_CTL_RDU_INO:
+ err = au_rdu_ino(file, &rdu);
+ break;
+
+ default:
+ /* err = -ENOTTY; */
+ err = -EINVAL;
+ }
+
+ out:
+ AuTraceErr(err);
+ return err;
+}
+#endif
{
struct au_sbinfo *sbinfo;
struct super_block *sb;
+ char *locked __maybe_unused; /* debug only */
sbinfo = container_of(kobj, struct au_sbinfo, si_kobj);
AuDebugOn(!list_empty(&sbinfo->si_plink.head));
si_write_lock(sb);
au_xino_clr(sb);
au_br_free(sbinfo);
+ si_write_unlock(sb);
+
+ AuDebugOn(radix_tree_gang_lookup
+ (&sbinfo->au_si_pid.tree, (void **)&locked,
+ /*first_index*/PID_MAX_DEFAULT - 1,
+ /*max_items*/sizeof(locked)/sizeof(*locked)));
+
kfree(sbinfo->si_branch);
+ kfree(sbinfo->au_si_pid.bitmap);
mutex_destroy(&sbinfo->si_xib_mtx);
- si_write_unlock(sb);
AuRwDestroy(&sbinfo->si_rwsem);
kfree(sbinfo);
if (unlikely(!sbinfo))
goto out;
+ BUILD_BUG_ON(sizeof(unsigned long) !=
+ sizeof(*sbinfo->au_si_pid.bitmap));
+ sbinfo->au_si_pid.bitmap = kcalloc(BITS_TO_LONGS(PID_MAX_DEFAULT),
+ sizeof(*sbinfo->au_si_pid.bitmap),
+ GFP_NOFS);
+ if (unlikely(!sbinfo->au_si_pid.bitmap))
+ goto out_sbinfo;
+
/* will be reallocated separately */
sbinfo->si_branch = kzalloc(sizeof(*sbinfo->si_branch), GFP_NOFS);
if (unlikely(!sbinfo->si_branch))
- goto out_sbinfo;
+ goto out_pidmap;
err = sysaufs_si_init(sbinfo);
if (unlikely(err))
au_nwt_init(&sbinfo->si_nowait);
au_rw_init_wlock(&sbinfo->si_rwsem);
+ spin_lock_init(&sbinfo->au_si_pid.tree_lock);
+ INIT_RADIX_TREE(&sbinfo->au_si_pid.tree, GFP_ATOMIC | __GFP_NOFAIL);
+
sbinfo->si_bend = -1;
sbinfo->si_wbr_copyup = AuWbrCopyup_Def;
/* leave other members for sysaufs and si_mnt. */
sbinfo->si_sb = sb;
sb->s_fs_info = sbinfo;
+ si_pid_set(sb);
au_debug_sbinfo_init(sbinfo);
return 0; /* success */
out_br:
kfree(sbinfo->si_branch);
+ out_pidmap:
+ kfree(sbinfo->au_si_pid.bitmap);
out_sbinfo:
kfree(sbinfo);
out:
di_write_unlock2(d1, d2);
si_read_unlock(d1->d_sb);
}
+
+/* ---------------------------------------------------------------------- */
+
+int si_pid_test_slow(struct super_block *sb)
+{
+ void *p;
+
+ rcu_read_lock();
+ p = radix_tree_lookup(&au_sbi(sb)->au_si_pid.tree, current->pid);
+ rcu_read_unlock();
+
+ return (long)p;
+}
+
+void si_pid_set_slow(struct super_block *sb)
+{
+ int err;
+ struct au_sbinfo *sbinfo;
+
+ AuDebugOn(si_pid_test_slow(sb));
+
+ sbinfo = au_sbi(sb);
+ err = radix_tree_preload(GFP_NOFS | __GFP_NOFAIL);
+ AuDebugOn(err);
+ spin_lock(&sbinfo->au_si_pid.tree_lock);
+ err = radix_tree_insert(&sbinfo->au_si_pid.tree, current->pid,
+ (void *)1);
+ spin_unlock(&sbinfo->au_si_pid.tree_lock);
+ AuDebugOn(err);
+ radix_tree_preload_end();
+}
+
+void si_pid_clr_slow(struct super_block *sb)
+{
+ void *p;
+ struct au_sbinfo *sbinfo;
+
+ AuDebugOn(!si_pid_test_slow(sb));
+
+ sbinfo = au_sbi(sb);
+ spin_lock(&sbinfo->au_si_pid.tree_lock);
+ p = radix_tree_delete(&sbinfo->au_si_pid.tree, current->pid);
+ spin_unlock(&sbinfo->au_si_pid.tree_lock);
+ AuDebugOn(1 != (long)p);
+}
static const struct super_operations aufs_sop = {
.alloc_inode = aufs_alloc_inode,
.destroy_inode = aufs_destroy_inode,
+ /* always deleting, no clearing */
.drop_inode = generic_delete_inode,
.show_options = aufs_show_options,
.statfs = aufs_statfs,
/* nowait tasks in the system-wide workqueue */
struct au_nowait_tasks si_nowait;
+ /*
+ * tried sb->s_umount, but failed due to the dependency between i_mutex.
+ * rwsem for au_sbinfo is necessary.
+ */
struct au_rwsem si_rwsem;
+ /* prevent recursive locking in deleting inode */
+ struct {
+ unsigned long *bitmap;
+ spinlock_t tree_lock;
+ struct radix_tree_root tree;
+ } au_si_pid;
+
/* branch management */
unsigned int si_generation;
void aufs_read_and_write_lock2(struct dentry *d1, struct dentry *d2, int isdir);
void aufs_read_and_write_unlock2(struct dentry *d1, struct dentry *d2);
+int si_pid_test_slow(struct super_block *sb);
+void si_pid_set_slow(struct super_block *sb);
+void si_pid_clr_slow(struct super_block *sb);
+
/* wbr_policy.c */
extern struct au_wbr_copyup_operations au_wbr_copyup_ops[];
extern struct au_wbr_create_operations au_wbr_create_ops[];
static inline int au_test_nfsd(struct task_struct *tsk)
{
- return !tsk->mm && !strcmp(tsk->comm, "nfsd");
+ return (current->flags & PF_KTHREAD)
+ && !strcmp(tsk->comm, "nfsd");
}
-int au_xigen_inc(struct inode *inode);
+void au_xigen_inc(struct inode *inode);
int au_xigen_new(struct inode *inode);
int au_xigen_set(struct super_block *sb, struct file *base);
void au_xigen_clr(struct super_block *sb);
#else
AuStubVoid(au_export_init, struct super_block *sb)
AuStubInt0(au_test_nfsd, struct task_struct *tsk)
-AuStubInt0(au_xigen_inc, struct inode *inode)
+AuStubVoid(au_xigen_inc, struct inode *inode)
AuStubInt0(au_xigen_new, struct inode *inode)
AuStubInt0(au_xigen_set, struct super_block *sb, struct file *base)
AuStubVoid(au_xigen_clr, struct super_block *sb)
/* ---------------------------------------------------------------------- */
+static inline pid_t si_pid_bit(void)
+{
+ /* the origin of pid is 1, but the bitmap's is 0 */
+ return current->pid - 1;
+}
+
+static inline int si_pid_test(struct super_block *sb)
+{
+ pid_t bit = si_pid_bit();
+ if (bit < PID_MAX_DEFAULT)
+ return test_bit(bit, au_sbi(sb)->au_si_pid.bitmap);
+ else
+ return si_pid_test_slow(sb);
+}
+
+static inline void si_pid_set(struct super_block *sb)
+{
+ pid_t bit = si_pid_bit();
+ if (bit < PID_MAX_DEFAULT) {
+ AuDebugOn(test_bit(bit, au_sbi(sb)->au_si_pid.bitmap));
+ set_bit(bit, au_sbi(sb)->au_si_pid.bitmap);
+ /* smp_mb(); */
+ } else
+ si_pid_set_slow(sb);
+}
+
+static inline void si_pid_clr(struct super_block *sb)
+{
+ pid_t bit = si_pid_bit();
+ if (bit < PID_MAX_DEFAULT) {
+ AuDebugOn(!test_bit(bit, au_sbi(sb)->au_si_pid.bitmap));
+ clear_bit(bit, au_sbi(sb)->au_si_pid.bitmap);
+ /* smp_mb(); */
+ } else
+ si_pid_clr_slow(sb);
+}
+
+/* ---------------------------------------------------------------------- */
+
/* lock superblock. mainly for entry point functions */
/*
- * si_noflush_read_lock, si_noflush_write_lock,
- * si_read_unlock, si_write_unlock, si_downgrade_lock
+ * __si_read_lock, __si_write_lock,
+ * __si_read_unlock, __si_write_unlock, __si_downgrade_lock
*/
-AuSimpleLockRwsemFuncs(si_noflush, struct super_block *sb,
- &au_sbi(sb)->si_rwsem);
-AuSimpleUnlockRwsemFuncs(si, struct super_block *sb, &au_sbi(sb)->si_rwsem);
+AuSimpleRwsemFuncs(__si, struct super_block *sb, &au_sbi(sb)->si_rwsem);
#define SiMustNoWaiters(sb) AuRwMustNoWaiters(&au_sbi(sb)->si_rwsem)
#define SiMustAnyLock(sb) AuRwMustAnyLock(&au_sbi(sb)->si_rwsem)
#define SiMustWriteLock(sb) AuRwMustWriteLock(&au_sbi(sb)->si_rwsem)
+static inline void si_noflush_read_lock(struct super_block *sb)
+{
+ __si_read_lock(sb);
+ si_pid_set(sb);
+}
+
+static inline int si_noflush_read_trylock(struct super_block *sb)
+{
+ int locked = __si_read_trylock(sb);
+ if (locked)
+ si_pid_set(sb);
+ return locked;
+}
+
+static inline void si_noflush_write_lock(struct super_block *sb)
+{
+ __si_write_lock(sb);
+ si_pid_set(sb);
+}
+
+static inline int si_noflush_write_trylock(struct super_block *sb)
+{
+ int locked = __si_write_trylock(sb);
+ if (locked)
+ si_pid_set(sb);
+ return locked;
+}
+
static inline void si_read_lock(struct super_block *sb, int flags)
{
if (au_ftest_lock(flags, FLUSH))
si_noflush_read_lock(sb);
}
-static inline void si_write_lock(struct super_block *sb)
-{
- au_nwt_flush(&au_sbi(sb)->si_nowait);
- si_noflush_write_lock(sb);
-}
-
static inline int si_read_trylock(struct super_block *sb, int flags)
{
if (au_ftest_lock(flags, FLUSH))
return si_noflush_read_trylock(sb);
}
+static inline void si_read_unlock(struct super_block *sb)
+{
+ si_pid_clr(sb);
+ __si_read_unlock(sb);
+}
+
+static inline void si_write_lock(struct super_block *sb)
+{
+ au_nwt_flush(&au_sbi(sb)->si_nowait);
+ si_noflush_write_lock(sb);
+}
+
+#if 0 /* unused */
static inline int si_write_trylock(struct super_block *sb, int flags)
{
if (au_ftest_lock(flags, FLUSH))
au_nwt_flush(&au_sbi(sb)->si_nowait);
return si_noflush_write_trylock(sb);
}
+#endif
+
+static inline void si_write_unlock(struct super_block *sb)
+{
+ si_pid_clr(sb);
+ __si_write_unlock(sb);
+}
+
+#if 0 /* unused */
+static inline void si_downgrade_lock(struct super_block *sb)
+{
+ __si_downgrade_lock(sb);
+}
+#endif
/* ---------------------------------------------------------------------- */
{
struct file *file;
- /* lockdep_off(); */
file = filp_open(path, oflags, mode);
- /* lockdep_on(); */
if (IS_ERR(file))
goto out;
vfsub_update_h_iattr(&file->f_path, /*did*/NULL); /*ignore*/
{
int err;
- /* lockdep_off(); */
err = kern_path(name, flags, path);
- /* lockdep_on(); */
if (!err && path->dentry->d_inode)
vfsub_update_h_iattr(path, /*did*/NULL); /*ignore*/
return err;
{
struct dentry *d;
- lockdep_off();
d = lock_rename(d1, d2);
- lockdep_on();
au_hn_suspend(hdir1);
if (hdir1 != hdir2)
au_hn_suspend(hdir2);
au_hn_resume(hdir1);
if (hdir1 != hdir2)
au_hn_resume(hdir2);
- lockdep_off();
unlock_rename(d1, d2);
- lockdep_on();
}
/* ---------------------------------------------------------------------- */
d = path->dentry;
path->dentry = d->d_parent;
- err = security_path_mknod(path, path->dentry, mode, 0);
+ err = security_path_mknod(path, d, mode, 0);
path->dentry = d;
if (unlikely(err))
goto out;
d = path->dentry;
path->dentry = d->d_parent;
- err = security_path_symlink(path, path->dentry, symname);
+ err = security_path_symlink(path, d, symname);
path->dentry = d;
if (unlikely(err))
goto out;
d = path->dentry;
path->dentry = d->d_parent;
- err = security_path_mknod(path, path->dentry, mode, dev);
+ err = security_path_mknod(path, d, mode, dev);
path->dentry = d;
if (unlikely(err))
goto out;
d = path->dentry;
path->dentry = d->d_parent;
- err = security_path_link(src_dentry, path, path->dentry);
+ err = security_path_link(src_dentry, path, d);
path->dentry = d;
if (unlikely(err))
goto out;
- /* lockdep_off(); */
err = vfs_link(src_dentry, dir, path->dentry);
- /* lockdep_on(); */
if (!err) {
struct path tmp = *path;
int did;
d = path->dentry;
path->dentry = d->d_parent;
tmp.dentry = src_dentry->d_parent;
- err = security_path_rename(&tmp, src_dentry, path, path->dentry);
+ err = security_path_rename(&tmp, src_dentry, path, d);
path->dentry = d;
if (unlikely(err))
goto out;
- /* lockdep_off(); */
err = vfs_rename(src_dir, src_dentry, dir, path->dentry);
- /* lockdep_on(); */
if (!err) {
int did;
d = path->dentry;
path->dentry = d->d_parent;
- err = security_path_mkdir(path, path->dentry, mode);
+ err = security_path_mkdir(path, d, mode);
path->dentry = d;
if (unlikely(err))
goto out;
d = path->dentry;
path->dentry = d->d_parent;
- err = security_path_rmdir(path, path->dentry);
+ err = security_path_rmdir(path, d);
path->dentry = d;
if (unlikely(err))
goto out;
- /* lockdep_off(); */
err = vfs_rmdir(dir, path->dentry);
- /* lockdep_on(); */
if (!err) {
struct path tmp = {
.dentry = path->dentry->d_parent,
{
ssize_t err;
mm_segment_t oldfs;
+ union {
+ void *k;
+ char __user *u;
+ } buf;
+ buf.k = kbuf;
oldfs = get_fs();
set_fs(KERNEL_DS);
- err = vfsub_read_u(file, (char __user *)kbuf, count, ppos);
+ err = vfsub_read_u(file, buf.u, count, ppos);
set_fs(oldfs);
return err;
}
{
ssize_t err;
- /* lockdep_off(); */
err = vfs_write(file, ubuf, count, ppos);
- /* lockdep_on(); */
if (err >= 0)
vfsub_update_h_iattr(&file->f_path, /*did*/NULL); /*ignore*/
return err;
{
ssize_t err;
mm_segment_t oldfs;
+ union {
+ void *k;
+ const char __user *u;
+ } buf;
+ buf.k = kbuf;
oldfs = get_fs();
set_fs(KERNEL_DS);
- err = vfsub_write_u(file, (const char __user *)kbuf, count, ppos);
+ err = vfsub_write_u(file, buf.u, count, ppos);
set_fs(oldfs);
return err;
}
{
int err;
- /* lockdep_off(); */
err = vfs_readdir(file, filldir, arg);
- /* lockdep_on(); */
if (err >= 0)
vfsub_update_h_iattr(&file->f_path, /*did*/NULL); /*ignore*/
return err;
{
long err;
- /* lockdep_off(); */
err = do_splice_to(in, ppos, pipe, len, flags);
- /* lockdep_on(); */
file_accessed(in);
if (err >= 0)
vfsub_update_h_iattr(&in->f_path, /*did*/NULL); /*ignore*/
{
long err;
- /* lockdep_off(); */
err = do_splice_from(pipe, out, ppos, len, flags);
- /* lockdep_on(); */
if (err >= 0)
vfsub_update_h_iattr(&out->f_path, /*did*/NULL); /*ignore*/
return err;
err = locks_verify_truncate(h_inode, h_file, length);
if (!err)
err = security_path_truncate(h_path, length, attr);
- if (!err) {
- /* lockdep_off(); */
+ if (!err)
err = do_truncate(h_path->dentry, length, attr, h_file);
- /* lockdep_on(); */
- }
out_inode:
if (!h_file)
*a->errp = -EPERM;
if (!IS_IMMUTABLE(h_inode) && !IS_APPEND(h_inode)) {
- /* lockdep_off(); */
*a->errp = notify_change(a->path->dentry, a->ia);
- /* lockdep_on(); */
if (!*a->errp)
vfsub_update_h_iattr(a->path, /*did*/NULL); /*ignore*/
}
if (h_inode)
atomic_inc(&h_inode->i_count);
- /* lockdep_off(); */
*a->errp = vfs_unlink(a->dir, d);
- /* lockdep_on(); */
if (!*a->errp) {
struct path tmp = {
.dentry = d->d_parent,
{
loff_t err;
- /* lockdep_off(); */
err = vfs_llseek(file, offset, origin);
- /* lockdep_on(); */
return err;
}
#include <linux/module.h>
#include "aufs.h"
-/* internal workqueue named AUFS_WKQ_NAME */
-static struct workqueue_struct *au_wkq;
+/* internal workqueue named AUFS_WKQ_NAME and AUFS_WKQ_PRE_NAME */
+enum {
+ AuWkq_INORMAL,
+ AuWkq_IPRE
+};
+
+static struct {
+ char *name;
+ struct workqueue_struct *wkq;
+} au_wkq[] = {
+ [AuWkq_INORMAL] = {
+ .name = AUFS_WKQ_NAME
+ },
+ [AuWkq_IPRE] = {
+ .name = AUFS_WKQ_PRE_NAME
+ }
+};
struct au_wkinfo {
struct work_struct wk;
}
#endif /* 4KSTACKS */
-static void au_wkq_run(struct au_wkinfo *wkinfo, int do_wait)
+static void au_wkq_run(struct au_wkinfo *wkinfo, unsigned int flags)
{
+ struct workqueue_struct *wkq;
+
au_dbg_verify_kthread();
- if (do_wait) {
+ if (flags & AuWkq_WAIT) {
INIT_WORK_ON_STACK(&wkinfo->wk, wkq_func);
- queue_work(au_wkq, &wkinfo->wk);
+ wkq = au_wkq[AuWkq_INORMAL].wkq;
+ if (flags & AuWkq_PRE)
+ wkq = au_wkq[AuWkq_IPRE].wkq;
+ queue_work(wkq, &wkinfo->wk);
} else {
INIT_WORK(&wkinfo->wk, wkq_func);
schedule_work(&wkinfo->wk);
}
}
-int au_wkq_wait(au_wkq_func_t func, void *args)
+int au_wkq_do_wait(unsigned int flags, au_wkq_func_t func, void *args)
{
int err;
AuWkqCompDeclare(comp);
struct au_wkinfo wkinfo = {
- .flags = AuWkq_WAIT,
+ .flags = flags,
.func = func,
.args = args
};
err = au_wkq_comp_alloc(&wkinfo, &comp);
if (!err) {
- au_wkq_run(&wkinfo, AuWkq_WAIT);
+ au_wkq_run(&wkinfo, flags);
/* no timeout, no interrupt */
wait_for_completion(wkinfo.comp);
au_wkq_comp_free(comp);
void au_wkq_fin(void)
{
- destroy_workqueue(au_wkq);
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(au_wkq); i++)
+ if (au_wkq[i].wkq)
+ destroy_workqueue(au_wkq[i].wkq);
}
int __init au_wkq_init(void)
{
- au_wkq = create_workqueue(AUFS_WKQ_NAME);
- return 0;
+ int err, i;
+
+ err = 0;
+ for (i = 0; !err && i < ARRAY_SIZE(au_wkq); i++) {
+ au_wkq[i].wkq = create_workqueue(au_wkq[i].name);
+ if (IS_ERR(au_wkq[i].wkq))
+ err = PTR_ERR(au_wkq[i].wkq);
+ else if (!au_wkq[i].wkq)
+ err = -ENOMEM;
+ if (unlikely(err))
+ au_wkq[i].wkq = NULL;
+ }
+ if (unlikely(err))
+ au_wkq_fin();
+
+ return err;
}
/* wkq flags */
#define AuWkq_WAIT 1
+#define AuWkq_PRE (1 << 1)
#define au_ftest_wkq(flags, name) ((flags) & AuWkq_##name)
#define au_fset_wkq(flags, name) { (flags) |= AuWkq_##name; }
#define au_fclr_wkq(flags, name) { (flags) &= ~AuWkq_##name; }
/* wkq.c */
-int au_wkq_wait(au_wkq_func_t func, void *args);
+int au_wkq_do_wait(unsigned int flags, au_wkq_func_t func, void *args);
int au_wkq_nowait(au_wkq_func_t func, void *args, struct super_block *sb);
void au_nwt_init(struct au_nowait_tasks *nwt);
int __init au_wkq_init(void);
/* ---------------------------------------------------------------------- */
+static inline int au_wkq_wait_pre(au_wkq_func_t func, void *args)
+{
+ return au_wkq_do_wait(AuWkq_WAIT | AuWkq_PRE, func, args);
+}
+
+static inline int au_wkq_wait(au_wkq_func_t func, void *args)
+{
+ return au_wkq_do_wait(AuWkq_WAIT, func, args);
+}
+
static inline int au_test_wkq(struct task_struct *tsk)
{
- return !tsk->mm
+ return (current->flags & PF_KTHREAD)
&& !strncmp(tsk->comm, AUFS_WKQ_NAME "/",
sizeof(AUFS_WKQ_NAME));
}
#include <linux/uaccess.h>
#include "aufs.h"
-ssize_t xino_fread(au_readf_t func, struct file *file, void *buf, size_t size,
+ssize_t xino_fread(au_readf_t func, struct file *file, void *kbuf, size_t size,
loff_t *pos)
{
ssize_t err;
mm_segment_t oldfs;
+ union {
+ void *k;
+ char __user *u;
+ } buf;
+ buf.k = kbuf;
oldfs = get_fs();
set_fs(KERNEL_DS);
do {
/* todo: signal_pending? */
- err = func(file, (char __user *)buf, size, pos);
+ err = func(file, buf.u, size, pos);
} while (err == -EAGAIN || err == -EINTR);
set_fs(oldfs);
/* ---------------------------------------------------------------------- */
-static ssize_t do_xino_fwrite(au_writef_t func, struct file *file, void *buf,
+static ssize_t do_xino_fwrite(au_writef_t func, struct file *file, void *kbuf,
size_t size, loff_t *pos)
{
ssize_t err;
mm_segment_t oldfs;
+ union {
+ void *k;
+ const char __user *u;
+ } buf;
+ buf.k = kbuf;
oldfs = get_fs();
set_fs(KERNEL_DS);
- /* lockdep_off(); */
do {
/* todo: signal_pending? */
- err = func(file, (const char __user *)buf, size, pos);
+ err = func(file, buf.u, size, pos);
} while (err == -EAGAIN || err == -EINTR);
- /* lockdep_on(); */
set_fs(oldfs);
#if 0 /* reserved for future use */
/* ---------------------------------------------------------------------- */
-int au_xino_write0(struct super_block *sb, aufs_bindex_t bindex, ino_t h_ino,
- ino_t ino)
+static void au_xib_clear_bit(struct inode *inode)
{
int err, bit;
unsigned long pindex;
+ struct super_block *sb;
struct au_sbinfo *sbinfo;
- if (!au_opt_test(au_mntflags(sb), XINO))
- return 0;
+ AuDebugOn(inode->i_nlink);
- err = 0;
- if (ino) {
- sbinfo = au_sbi(sb);
- xib_calc_bit(ino, &pindex, &bit);
- AuDebugOn(page_bits <= bit);
- mutex_lock(&sbinfo->si_xib_mtx);
- err = xib_pindex(sb, pindex);
- if (!err) {
- clear_bit(bit, sbinfo->si_xib_buf);
- sbinfo->si_xib_next_bit = bit;
- }
- mutex_unlock(&sbinfo->si_xib_mtx);
+ sb = inode->i_sb;
+ xib_calc_bit(inode->i_ino, &pindex, &bit);
+ AuDebugOn(page_bits <= bit);
+ sbinfo = au_sbi(sb);
+ mutex_lock(&sbinfo->si_xib_mtx);
+ err = xib_pindex(sb, pindex);
+ if (!err) {
+ clear_bit(bit, sbinfo->si_xib_buf);
+ sbinfo->si_xib_next_bit = bit;
}
+ mutex_unlock(&sbinfo->si_xib_mtx);
+}
- if (!err)
- err = au_xino_write(sb, bindex, h_ino, 0);
- return err;
+/* for s_op->delete_inode() */
+void au_xino_delete_inode(struct inode *inode, const int unlinked)
+{
+ int err;
+ unsigned int mnt_flags;
+ aufs_bindex_t bindex, bend, bi;
+ unsigned char try_trunc;
+ struct au_iinfo *iinfo;
+ struct super_block *sb;
+ struct au_hinode *hi;
+ struct inode *h_inode;
+ struct au_branch *br;
+ au_writef_t xwrite;
+
+ sb = inode->i_sb;
+ mnt_flags = au_mntflags(sb);
+ if (!au_opt_test(mnt_flags, XINO)
+ || inode->i_ino == AUFS_ROOT_INO)
+ return;
+
+ if (unlinked) {
+ au_xigen_inc(inode);
+ au_xib_clear_bit(inode);
+ }
+
+ iinfo = au_ii(inode);
+ if (!iinfo)
+ return;
+
+ bindex = iinfo->ii_bstart;
+ if (bindex < 0)
+ return;
+
+ xwrite = au_sbi(sb)->si_xwrite;
+ try_trunc = !!au_opt_test(mnt_flags, TRUNC_XINO);
+ hi = iinfo->ii_hinode + bindex;
+ bend = iinfo->ii_bend;
+ for (; bindex <= bend; bindex++, hi++) {
+ h_inode = hi->hi_inode;
+ if (!h_inode
+ || (!unlinked && h_inode->i_nlink))
+ continue;
+
+ /* inode may not be revalidated */
+ bi = au_br_index(sb, hi->hi_id);
+ if (bi < 0)
+ continue;
+
+ br = au_sbr(sb, bi);
+ err = au_xino_do_write(xwrite, br->br_xino.xi_file,
+ h_inode->i_ino, /*ino*/0);
+ if (!err && try_trunc
+ && au_test_fs_trunc_xino(br->br_mnt->mnt_sb))
+ xino_try_trunc(sb, br);
+ }
}
/* get an unused inode number from bitmap */